LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 40527 - Missed opportunity to remove a dead store
Summary: Missed opportunity to remove a dead store
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC Linux
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-30 05:56 PST by Alexander Potapenko
Modified: 2020-10-22 06:00 PDT (History)
16 users (show)

See Also:
Fixed By Commit(s): 51ff04567b2f8d06b2062bd3ed72eab2e93e4466


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Potapenko 2019-01-30 05:56:10 PST
When compiling the following code:

============================
  void alloc_sock(int *err);
  int try_send();

  int send(int len) {
    int err;
    if (len) {
      err = -1; 
      alloc_sock(&err);
      err = try_send();
    }
    return -1; 
  }
============================

with 
 $ clang -O2  -g -ftrivial-auto-var-init=pattern -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang -c t.c

Clang produces an extra store of $0xaaaaaaaa to |err|, which is introduced by the -ftrivial-auto-var-init and not removed by other optimizations in the pipeline:

$ objdump -d t.o

0000000000000000 <send>:
   0:	50                   	push   %rax
   1:	c7 44 24 04 aa aa aa 	movl   $0xaaaaaaaa,0x4(%rsp)
   8:	aa 
   9:	85 ff                	test   %edi,%edi
   b:	74 1d                	je     2a <send+0x2a>
   d:	c7 44 24 04 ff ff ff 	movl   $0xffffffff,0x4(%rsp)
  14:	ff 
  15:	48 8d 7c 24 04       	lea    0x4(%rsp),%rdi
  1a:	e8 00 00 00 00       	callq  1f <send+0x1f>
  1f:	31 c0                	xor    %eax,%eax
  21:	e8 00 00 00 00       	callq  26 <send+0x26>
  26:	89 44 24 04          	mov    %eax,0x4(%rsp)
  2a:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
  2f:	59                   	pop    %rcx
  30:	c3                   	retq  

As noted by JF Bastien here: https://reviews.llvm.org/D54604, "Seems like the optimizer should sink the store enough that it appears only once on each path."
Comment 1 Sanjay Patel 2019-01-30 06:10:41 PST
Isn't this a job for -dse or some other pass in the IR optimizer rather than a backend problem?

define i32 @send(i32 %len) {
entry:
  %err = alloca i32, align 4
  store i32 -1431655766, i32* %err, align 4   <--- dead store 
  %tobool = icmp eq i32 %len, 0
  br i1 %tobool, label %if.end, label %if.then

if.then:                                          
  store i32 -1, i32* %err, align 4
  call void @alloc_sock(i32* nonnull %err) #3
  %call = call i32 (...) @try_send() #3
  store i32 %call, i32* %err, align 4
  br label %if.end

if.end:                                           
  ret i32 -1
}

declare void @alloc_sock(i32*) local_unnamed_addr #2
declare i32 @try_send(...) local_unnamed_addr #2
Comment 2 Alexander Potapenko 2019-01-30 06:14:19 PST
Well, yes. I've illustrated the problem with an assembly excerpt, but this is indeed in IR.
Comment 3 Alexander Potapenko 2019-01-30 06:14:19 PST
Well, yes. I've illustrated the problem with an assembly excerpt, but this is indeed in IR.
Comment 4 Sanjay Patel 2019-01-30 06:23:08 PST
Changing component - not sure if anyone gets auto-cc'd on "scalar optimizations", but it's worth a try. :)
Comment 5 Eli Friedman 2019-01-30 09:40:28 PST
There's been work in the past to do cross-block DSE, but it's currently not in trunk. See https://reviews.llvm.org/D13363 https://reviews.llvm.org/D22568 etc.
Comment 6 JF Bastien 2019-01-30 10:16:49 PST
Apologies, I thought the code had a diamond and we needed to DSE on one side, and not on the other (some min-cut). This missed optimization is even worst than what I'd understood.

For Apple folks, I mirrored this bug in <rdar://problem/47671216>
Comment 7 Eli Friedman 2019-01-30 13:44:28 PST
There's also a patch for partial DSE, which never got merged: https://reviews.llvm.org/D29866 / https://reviews.llvm.org/D29865 .  But the store in the given testcase is fully redundant, yes.

There's also slightly more recent WIP patch for cross-block DSE: https://reviews.llvm.org/D29624 .
Comment 8 Alexander Potapenko 2019-02-12 04:35:48 PST
Added more people from CLs mentioned by Eli.
Folks, do you think reviving these CLs may help here?
Comment 9 Florian Hahn 2020-01-13 16:15:35 PST
FYI I tried to revive MemorySSA backed DSE and broke down the patches in relatively small chunks: https://reviews.llvm.org/D72146 (and linked)

The patch series I posted still does not quite cover this case, but I know what's missing (exploring multiple memorydefs) and that should also be addressed at some point soon.
Comment 10 Florian Hahn 2020-08-14 01:30:12 PDT
MemorySSA backed DSE on trunk can remove the extra stores now (needs -mllvm -enable-dse-memoryssa): https://godbolt.org/z/x5YPsd
Comment 11 David Bolvansky 2020-09-11 06:03:00 PDT
@fhahn, close and commit testcase?
Comment 12 Florian Hahn 2020-09-11 06:23:51 PDT
(In reply to David Bolvansky from comment #11)
> @fhahn, close and commit testcase?

I think there are already plenty of tests that cover the scenario in tree. The unnecessary store is now removed without extra options, but I was planning on waiting with closing the issues a bit to wait until the new default sticks.
Comment 13 Florian Hahn 2020-10-22 06:00:05 PDT
DSE + MemorySSA is the default now (51ff04567b2f8d06b2062bd3ed72eab2e93e4466).

The store gets removed on current trunk without extra options:  https://godbolt.org/z/avq4zz