50816 – Missed opportunity to simplify bitselect pattern

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations
Version:	trunk
Hardware:	PC Windows NT

Importance:	P enhancement
Assignee:	Unassigned LLVM Bugs

Reported:	2021-06-23 05:44 PDT by Simon Pilgrim
Modified:	2021-07-13 06:55 PDT
CC List:	3 users

Fixed By Commit(s):	a488c7879e68 b2f6cf14798a

Description Simon Pilgrim 2021-06-23 05:44:18 PDT

Pulled from the Bullet source code:

unsigned btSelect(unsigned condition, unsigned valueIfConditionNonZero, unsigned valueIfConditionZero) 
{
    // Set testNz to 0xFFFFFFFF if condition is nonzero, 0x00000000 if condition is zero
    // Rely on positive value or'ed with its negative having sign bit on
    // and zero value or'ed with its negative (which is still zero) having sign bit off 
    // Use arithmetic shift right, shifting the sign bit through all 32 bits
    unsigned testNz = (unsigned)(((int)condition | -(int)condition) >> 31);
    unsigned testEqz = ~testNz;
    return ((valueIfConditionNonZero & testNz) | (valueIfConditionZero & testEqz)); 
}

https://c.godbolt.org/z/rMf6Ta613

The following IR (src) doesn't get simplified to the expected form (tgt):

define i8 @src(i8 %a0, i8 %a1, i8 %a2) {
  %add = add i8 %a0, -1
  %xor = xor i8 %a0, -1
  %and = and i8 %add, %xor
  %cmp = icmp sgt i8 %and, -1
  %sel = select i1 %cmp, i8 %a1, i8 %a2
  ret i8 %sel
}

define i8 @tgt(i8 %a0, i8 %a1, i8 %a2) {
  %cmp = icmp ne i8 %a0, 0
  %sel = select i1 %cmp, i8 %a1, i8 %a2
  ret i8 %sel
}

Transformation seems to be correct!
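
For readers who want to sanity-check the i8 fold without Alive2, here is a minimal C sketch that models both predicates and compares them for all 256 values of %a0. The select is identical in src and tgt, so only the condition is compared. The function names (src_pred, tgt_pred, count_i8_mismatches, assert_i8_fold_holds) are made up for this illustration and do not come from LLVM or Bullet.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the condition in @src: icmp sgt ((a0 + -1) & (a0 ^ -1)), -1 */
static int src_pred(uint8_t a0)
{
    uint8_t add = (uint8_t)(a0 + 0xFFu);  /* add i8 %a0, -1 (wraps modulo 256) */
    uint8_t xr  = (uint8_t)(a0 ^ 0xFFu);  /* xor i8 %a0, -1 (bitwise not) */
    uint8_t msk = (uint8_t)(add & xr);    /* and i8 %add, %xor */
    return (msk & 0x80u) == 0;            /* signed "> -1" means the sign bit is clear */
}

/* Model of the condition in @tgt: icmp ne a0, 0 */
static int tgt_pred(uint8_t a0)
{
    return a0 != 0;
}

/* Count the i8 inputs on which the two predicates disagree. */
unsigned count_i8_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned v = 0; v < 256; ++v) {
        if (src_pred((uint8_t)v) != tgt_pred((uint8_t)v))
            ++mismatches;
    }
    return mismatches;
}

/* Convenience wrapper: aborts if any input disagrees. */
void assert_i8_fold_holds(void)
{
    assert(count_i8_mismatches() == 0);
}
```

Calling assert_i8_fold_holds() from a small main() is enough to run the check. It only covers the 8-bit case shown above; it is not a substitute for the Alive2 proof.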

Comment 1 Simon Pilgrim 2021-07-09 10:00:57 PDT

We don't get the simpler variant either:

define i32 @src(i32 %0) {
%1:
  %2 = sub nsw i32 0, %0
  %3 = or i32 %2, %0
  %4 = ashr i32 %3, 31
  ret i32 %4
}
=>
define i32 @tgt(i32 %0) {
%1:
  %2 = icmp ne i32 %0, 0
  %3 = sext i1 %2 to i32
  ret i32 %3
}
Transformation seems to be correct!
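
The same mask can be modelled in portable C as a rough cross-check. This sketch uses unsigned arithmetic so that negation wraps and the arithmetic shift by 31 is emulated by replicating the sign bit, avoiding the implementation-defined signed shift in the original btSelect. All names here (src_mask, tgt_mask, bt_select_sketch, count_i32_mismatches) are hypothetical, and the sample list is an arbitrary choice. INT_MIN is left out because `sub nsw 0, %0` is poison for that input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of @src: ashr (or (sub 0, x), x), 31 */
static uint32_t src_mask(uint32_t x)
{
    uint32_t neg = (uint32_t)(0u - x);   /* two's-complement negate */
    uint32_t orv = neg | x;              /* sign bit set iff x != 0 */
    return (uint32_t)(0u - (orv >> 31)); /* replicate the sign bit into all 32 bits */
}

/* Model of @tgt: sext (icmp ne x, 0) */
static uint32_t tgt_mask(uint32_t x)
{
    return x != 0 ? 0xFFFFFFFFu : 0u;
}

/* Hypothetical portable rewrite of btSelect built on the mask above. */
uint32_t bt_select_sketch(uint32_t condition, uint32_t if_nonzero, uint32_t if_zero)
{
    uint32_t test_nz = src_mask(condition);
    return (if_nonzero & test_nz) | (if_zero & ~test_nz);
}

/* Compare the two masks on a few edge cases; returns the mismatch count. */
unsigned count_i32_mismatches(void)
{
    /* 0x80000000 (INT_MIN) is omitted: the nsw negation is poison there. */
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x00010000u, 0x7FFFFFFFu, 0x80000001u, 0xFFFFFFFFu
    };
    unsigned mismatches = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (src_mask(samples[i]) != tgt_mask(samples[i]))
            ++mismatches;
    }
    return mismatches;
}
```

With these definitions, bt_select_sketch(c, a, b) behaves like `c ? a : b` for the sampled inputs, which is the form the optimizer is expected to recover.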

Comment 2 Simon Pilgrim 2021-07-10 13:54:54 PDT

Candidate Patch: https://reviews.llvm.org/D105764

Comment 3 Sanjay Patel 2021-07-12 06:18:07 PDT

(In reply to Simon Pilgrim from comment #2)
> Candidate Patch: https://reviews.llvm.org/D105764

I didn't check if that patch would solve the motivating source example, but I think we need the icmp fold(s) anyway, so:
https://reviews.llvm.org/rGa488c7879e68

Comment 4 Simon Pilgrim 2021-07-13 06:55:21 PDT

(In reply to Simon Pilgrim from comment #2)
> Candidate Patch: https://reviews.llvm.org/D105764

Committed at rGb2f6cf14798a