LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 48640 - Missed optimization for ((X ashr C1) & C2) == 0
Summary: Missed optimization for ((X ashr C1) & C2) == 0
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC Linux
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-12-31 06:44 PST by David Bolvansky
Modified: 2021-10-18 06:06 PDT (History)
3 users (show)

See Also:
Fixed By Commit(s): 288f3fc5dfee0c51fc00fe10a985f93c505073eb


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David Bolvansky 2020-12-31 06:44:34 PST
void foo();

void bad1(int e)
{
   if (((e >> 31) & 64) == 0) foo();
}

void bad2(int e)
{
   if (((e >> 31) & 64) != 0) foo();
}


void ok1(unsigned e)
{
   if (((e >> 31) & 64) == 0) foo();
}

void ok2(unsigned e)
{
   if (((e >> 31) & 64) != 0) foo();
}


LLVM:
bad1(int): # @bad1(int)
  shr edi, 25
  test dil, 64
  jne .LBB0_1
  jmp foo() # TAILCALL
.LBB0_1:
  ret
bad2(int): # @bad2(int)
  shr edi, 25
  test dil, 64
  jne .LBB1_2
  ret
.LBB1_2:
  jmp foo() # TAILCALL
ok1(unsigned int): # @ok1(unsigned int)
  jmp foo() # TAILCALL
ok2(unsigned int): # @ok2(unsigned int)
  ret


GCC:
bad1(int):
        test    edi, edi
        jns     .L4
        ret
.L4:
        jmp     foo()
bad2(int):
        test    edi, edi
        js      .L7
        ret
.L7:
        jmp     foo()
ok1(unsigned int):
        jmp     foo()
ok2(unsigned int):
        ret


https://godbolt.org/z/x85bf6


So..
((X ashr C1) & C2) != 0 to X < 0

..and
((X ashr C1) & C2) == 0 to X >= 0?


define i1 @src(i32 %0) {
%1:
  %2 = ashr i32 %0, 31
  %3 = and i32 %2, 64
  %4 = icmp eq i32 %3, 0
  ret i1 %4
}
=>
define i1 @tgt(i32 %0) {
%1:
  %r = icmp sge i32 %0, 0
  ret i1 %r
}
Transformation seems to be correct!


define i1 @src(i32 %0) {
%1:
  %2 = ashr i32 %0, 31
  %3 = and i32 %2, 64
  %4 = icmp ne i32 %3, 0
  ret i1 %4
}
=>
define i1 @tgt(i32 %0) {
%1:
  %r = icmp slt i32 %0, 0
  ret i1 %r
}
Transformation seems to be correct!
Comment 1 Sanjay Patel 2021-01-01 10:12:39 PST
The interesting part of this one is how to generalize the pre-conditions on the constants. I think what we're asking is: "Is the mask constant only keeping bits that are guaranteed to be the same as the sign bit?"

There might be a better way to write that, but I came up with this:
https://rise4fun.com/Alive/Z0UF

Name: eq
Pre: C2 != 0 && (~(-1 u>> C1) & C2) == C2
 %s = ashr i32 %x, C1
 %a = and i32 %s, C2
 %r = icmp eq i32 %a, 0
 =>
 %r = icmp sge i32 %x, 0
Comment 2 David Bolvansky 2021-01-01 11:32:19 PST
gcc supports this optimization with C1 as a valid shift count and C2 power of two.

“If (C2 << C1) doesn't overflow, then ((X >> C1) & C2) != 0
can be rewritten as (X & (C2 << C1)) != 0.” 

Else (signed case): see above.


And it seems LLVM handles the case with no overflow.
Comment 3 Sanjay Patel 2021-01-04 10:26:42 PST
We miss even more basic patterns (without the 'and' mask):
https://reviews.llvm.org/D94014
Comment 4 Simon Pilgrim 2021-10-17 08:11:58 PDT
(In reply to Sanjay Patel from comment #3)
> We miss even more basic patterns (without the 'and' mask):
> https://reviews.llvm.org/D94014

This part landed at rG288f3fc5dfee

Not sure if there were any specific followups but the code in trunk looks to be fixed.

@spatel - OK to resolve? (and do you recall if any other patches were necessary?)
Comment 5 Sanjay Patel 2021-10-18 06:06:57 PDT
From the debug spew of instcombine, we now canonicalize the ashr+and to icmp+select, so that's the very recent:
https://reviews.llvm.org/D111410

We might've have managed to reduce it through some other combination of folds before last week, but either way, the examples here are now fixed.