Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed optimization for ((X ashr C1) & C2) == 0 #47984

Closed
davidbolvansky opened this issue Dec 31, 2020 · 5 comments
Closed

Missed optimization for ((X ashr C1) & C2) == 0 #47984

davidbolvansky opened this issue Dec 31, 2020 · 5 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@davidbolvansky
Copy link
Collaborator

Bugzilla Link 48640
Resolution FIXED
Resolved on Oct 18, 2021 06:06
Version trunk
OS Linux
CC @RKSimon,@rotateright
Fixed by commit(s) 288f3fc

Extended Description

void foo();

void bad1(int e)
{
if (((e >> 31) & 64) == 0) foo();
}

void bad2(int e)
{
if (((e >> 31) & 64) != 0) foo();
}

void ok1(unsigned e)
{
if (((e >> 31) & 64) == 0) foo();
}

void ok2(unsigned e)
{
if (((e >> 31) & 64) != 0) foo();
}

LLVM:
bad1(int): # @​bad1(int)
shr edi, 25
test dil, 64
jne .LBB0_1
jmp foo() # TAILCALL
.LBB0_1:
ret
bad2(int): # @​bad2(int)
shr edi, 25
test dil, 64
jne .LBB1_2
ret
.LBB1_2:
jmp foo() # TAILCALL
ok1(unsigned int): # @​ok1(unsigned int)
jmp foo() # TAILCALL
ok2(unsigned int): # @​ok2(unsigned int)
ret

GCC:
bad1(int):
test edi, edi
jns .L4
ret
.L4:
jmp foo()
bad2(int):
test edi, edi
js .L7
ret
.L7:
jmp foo()
ok1(unsigned int):
jmp foo()
ok2(unsigned int):
ret

https://godbolt.org/z/x85bf6

So..
((X ashr C1) & C2) != 0 to X < 0

..and
((X ashr C1) & C2) == 0 to X >= 0?

define i1 @​src(i32 %0) {
%1:
%2 = ashr i32 %0, 31
%3 = and i32 %2, 64
%4 = icmp eq i32 %3, 0
ret i1 %4
}
=>
define i1 @​tgt(i32 %0) {
%1:
%r = icmp sge i32 %0, 0
ret i1 %r
}
Transformation seems to be correct!

define i1 @​src(i32 %0) {
%1:
%2 = ashr i32 %0, 31
%3 = and i32 %2, 64
%4 = icmp ne i32 %3, 0
ret i1 %4
}
=>
define i1 @​tgt(i32 %0) {
%1:
%r = icmp slt i32 %0, 0
ret i1 %r
}
Transformation seems to be correct!

@rotateright
Copy link
Contributor

The interesting part of this one is how to generalize the pre-conditions on the constants. I think what we're asking is: "Is the mask constant only keeping bits that are guaranteed to be the same as the sign bit?"

There might be a better way to write that, but I came up with this:
https://rise4fun.com/Alive/Z0UF

Name: eq
Pre: C2 != 0 && (~(-1 u>> C1) & C2) == C2
%s = ashr i32 %x, C1
%a = and i32 %s, C2
%r = icmp eq i32 %a, 0
=>
%r = icmp sge i32 %x, 0

@davidbolvansky
Copy link
Collaborator Author

gcc supports this optimization with C1 as a valid shift count and C2 power of two.

“If (C2 << C1) doesn't overflow, then ((X >> C1) & C2) != 0
can be rewritten as (X & (C2 << C1)) != 0.”

Else (signed case): see above.

And it seems LLVM handles the case with no overflow.

@rotateright
Copy link
Contributor

We miss even more basic patterns (without the 'and' mask):
https://reviews.llvm.org/D94014

@RKSimon
Copy link
Collaborator

RKSimon commented Oct 17, 2021

We miss even more basic patterns (without the 'and' mask):
https://reviews.llvm.org/D94014

This part landed at rG288f3fc5dfee

Not sure if there were any specific followups but the code in trunk looks to be fixed.

@​spatel - OK to resolve? (and do you recall if any other patches were necessary?)

@rotateright
Copy link
Contributor

From the debug spew of instcombine, we now canonicalize the ashr+and to icmp+select, so that's the very recent:
https://reviews.llvm.org/D111410

We might've have managed to reduce it through some other combination of folds before last week, but either way, the examples here are now fixed.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

3 participants