Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed opportunity to remove a dead store #39873

Closed
ramosian-glider opened this issue Jan 30, 2019 · 13 comments
Closed

Missed opportunity to remove a dead store #39873

ramosian-glider opened this issue Jan 30, 2019 · 13 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@ramosian-glider
Copy link
Contributor

Bugzilla Link 40527
Resolution FIXED
Resolved on Oct 22, 2020 06:00
Version trunk
OS Linux
CC @topperc,@davidbolvansky,@KernelAddress,@efriedma-quic,@fhahn,@kcc,@bryant,@jfbastien,@RKSimon,@rjmccall,@rotateright,@vitalybuka
Fixed by commit(s) 51ff045

Extended Description

When compiling the following code:

============================
void alloc_sock(int *err);
int try_send();

int send(int len) {
int err;
if (len) {
err = -1;
alloc_sock(&err);
err = try_send();
}
return -1;
}

with
$ clang -O2 -g -ftrivial-auto-var-init=pattern -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang -c t.c

Clang produces an extra store of $0xaaaaaaaa to |err|, which is introduced by the -ftrivial-auto-var-init and not removed by other optimizations in the pipeline:

$ objdump -d t.o

0000000000000000 :
0: 50 push %rax
1: c7 44 24 04 aa aa aa movl $0xaaaaaaaa,0x4(%rsp)
8: aa
9: 85 ff test %edi,%edi
b: 74 1d je 2a <send+0x2a>
d: c7 44 24 04 ff ff ff movl $0xffffffff,0x4(%rsp)
14: ff
15: 48 8d 7c 24 04 lea 0x4(%rsp),%rdi
1a: e8 00 00 00 00 callq 1f <send+0x1f>
1f: 31 c0 xor %eax,%eax
21: e8 00 00 00 00 callq 26 <send+0x26>
26: 89 44 24 04 mov %eax,0x4(%rsp)
2a: b8 ff ff ff ff mov $0xffffffff,%eax
2f: 59 pop %rcx
30: c3 retq

As noted by JF Bastien here: https://reviews.llvm.org/D54604, "Seems like the optimizer should sink the store enough that it appears only once on each path."

@rotateright
Copy link
Contributor

Isn't this a job for -dse or some other pass in the IR optimizer rather than a backend problem?

define i32 @​send(i32 %len) {
entry:
%err = alloca i32, align 4
store i32 -1431655766, i32* %err, align 4 <--- dead store
%tobool = icmp eq i32 %len, 0
br i1 %tobool, label %if.end, label %if.then

if.then:
store i32 -1, i32* %err, align 4
call void @​alloc_sock(i32* nonnull %err) #​3
%call = call i32 (...) @​try_send() #​3
store i32 %call, i32* %err, align 4
br label %if.end

if.end:
ret i32 -1
}

declare void @​alloc_sock(i32*) local_unnamed_addr #​2
declare i32 @​try_send(...) local_unnamed_addr #​2

@ramosian-glider
Copy link
Contributor Author

Well, yes. I've illustrated the problem with an assembly excerpt, but this is indeed in IR.

1 similar comment
@ramosian-glider
Copy link
Contributor Author

Well, yes. I've illustrated the problem with an assembly excerpt, but this is indeed in IR.

@rotateright
Copy link
Contributor

Changing component - not sure if anyone gets auto-cc'd on "scalar optimizations", but it's worth a try. :)

@efriedma-quic
Copy link
Collaborator

There's been work in the past to do cross-block DSE, but it's currently not in trunk. See https://reviews.llvm.org/D13363 https://reviews.llvm.org/D22568 etc.

@jfbastien
Copy link
Member

Apologies, I thought the code had a diamond and we needed to DSE on one side, and not on the other (some min-cut). This missed optimization is even worst than what I'd understood.

For Apple folks, I mirrored this bug in rdar://problem/47671216

@efriedma-quic
Copy link
Collaborator

There's also a patch for partial DSE, which never got merged: https://reviews.llvm.org/D29866 / https://reviews.llvm.org/D29865 . But the store in the given testcase is fully redundant, yes.

There's also slightly more recent WIP patch for cross-block DSE: https://reviews.llvm.org/D29624 .

@ramosian-glider
Copy link
Contributor Author

Added more people from CLs mentioned by Eli.
Folks, do you think reviving these CLs may help here?

@fhahn
Copy link
Contributor

fhahn commented Jan 14, 2020

FYI I tried to revive MemorySSA backed DSE and broke down the patches in relatively small chunks: https://reviews.llvm.org/D72146 (and linked)

The patch series I posted still does not quite cover this case, but I know what's missing (exploring multiple memorydefs) and that should also be addressed at some point soon.

@fhahn
Copy link
Contributor

fhahn commented Aug 14, 2020

MemorySSA backed DSE on trunk can remove the extra stores now (needs -mllvm -enable-dse-memoryssa): https://godbolt.org/z/x5YPsd

@davidbolvansky
Copy link
Collaborator

@​fhahn, close and commit testcase?

@fhahn
Copy link
Contributor

fhahn commented Sep 11, 2020

@​fhahn, close and commit testcase?

I think there are already plenty of tests that cover the scenario in tree. The unnecessary store is now removed without extra options, but I was planning on waiting with closing the issues a bit to wait until the new default sticks.

@fhahn
Copy link
Contributor

fhahn commented Oct 22, 2020

DSE + MemorySSA is the default now (51ff045).

The store gets removed on current trunk without extra options: https://godbolt.org/z/avq4zz

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

6 participants