[InstCombine] Incorrect folding of LShr into select instruction #47697

CongzheUalberta · 2020-12-01T22:16:31Z


Bugzilla Link	48353
Version	trunk
OS	Linux
CC	@aqjune,@LebedevRI,@RKSimon,@meheff,@nunoplopes,@regehr,@rotateright

Extended Description

The IR that exposes this bug is as follows.

; ModuleID = 'test.c'

@m = common dso_local local_unnamed_addr global i32 0, align 4

define dso_local i32 @test() local_unnamed_addr #1 {
entry:
call fastcc void @foo(i32 51)
ret i32 0
}

define internal fastcc void @foo(i32 %aj) {
entry:
%cmp.i = icmp sgt i32 %aj, 1
%shr.i = select i1 %cmp.i, i32 0, i32 %aj
%cond.i1 = lshr i32 3, %shr.i
%conv.i = trunc i32 %cond.i1 to i8
%0 = and i8 %conv.i, 1
%cmp4 = icmp ugt i8 %0, 0
%conv5 = zext i1 %cmp4 to i32
store i32 %conv5, i32* @m
ret void
}

Note that at compile time we know the argument of foo() is %aj = 51 thus "%shr.i = select i1 %cmp.i, i32 0, i32 %aj" will result in 0 and hence LShr instruction that follows is correct, not resulting in a poison value.

When running "opt -instcombine test.ll -S", the following IR is generated. Specifically, "%aj.op = lshr i32 3, %aj" is generated which result in a poison value since %aj is actually greater than 32.

@m = common dso_local local_unnamed_addr global i32 0, align 4

define dso_local i32 @test() local_unnamed_addr #1 {
entry:
call fastcc void @foo(i32 51)
ret i32 0
}

define internal fastcc void @o(i32 %aj) {
entry:
%cmp.i = icmp sgt i32 %aj, 1
%aj.op = lshr i32 3, %aj
%.op2 = and i32 %aj.op, 1
%cmp.i3 = zext i1 %cmp.i to i32
%conv55 = or i32 %.op2, %cmp.i3
store i32 %conv55, i32* @m, align 4
ret void
}

The cause is that during InstCombine, we fold

%shr.i = select i1 %cmp.i, i32 0, i32 %aj
%cond.i1 = lshr i32 3, %shr.i

into

%aj.op = lshr i32 3, %aj
%cond.i1 = select i1 %cmp.i, i32 3, i32 %aj.op,

thus generating the incorrect LShr instruction "%aj.op = lshr i32 3, %aj".

Our proposed fix:

In function InstCombiner::FoldOpIntoSelect() from Transforms/InstCombine/InstructionCombining.cpp, add the following check such that we only fold LShr into select when both the true value and the false value are known to be constants.

// If op is a LShr instruction, we can only fold into select when both TV
// and FV are known to be constants, otherwise they may be evaluated to be
// greater than the width of the first operand of LShr, resulting in for
// example:
// LShr i32 X Y where Y > 32,
// which is not suppose to be generated.
if (Op.getOpcode() == Instruction::LShr &&
!(isa(TV) && isa(FV))) {
return nullptr;
}

LebedevRI · 2020-12-02T10:27:55Z

So the transform that is causing miscompile is

define i32 @src(i32 %aj) {
%entry:
%cmp.i = icmp sgt i32 %aj, 1
%aj.op = lshr i32 3, %aj
%.op2 = and i32 %aj.op, 1
%cmp41 = icmp ne i32 %.op2, 0
%cmp4 = select i1 %cmp.i, i1 1, i1 %cmp41
%conv5 = zext i1 %cmp4 to i32
ret i32 %conv5
}
=>
define i32 @tgt(i32 %aj) {
%entry:
%cmp.i = icmp sgt i32 %aj, 1
%aj.op = lshr i32 3, %aj
%.op2 = and i32 %aj.op, 1
%cmp41 = icmp ne i32 %.op2, 0
%cmp4 = or i1 %cmp.i, %cmp41
%conv5 = zext i1 %cmp4 to i32
ret i32 %conv5
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
i32 %aj = #x00000020 (32)

Source:
i1 %cmp.i = #x1 (1)
i32 %aj.op = poison
i32 %.op2 = poison
i1 %cmp41 = poison
i1 %cmp4 = #x1 (1)
i32 %conv5 = #x00000001 (1)

Target:
i1 %cmp.i = #x1 (1)
i32 %aj.op = poison
i32 %.op2 = poison
i1 %cmp41 = poison
i1 %cmp4 = poison
i32 %conv5 = poison
Source value: #x00000001 (1)
Target value: poison

I.e.

define i1 @src(i1 %cmp.i, i1 %cmp41) {
%entry:
%cmp4 = select i1 %cmp.i, i1 1, i1 %cmp41
ret i1 %cmp4
}
=>
define i1 @tgt(i1 %cmp.i, i1 %cmp41) {
%entry:
%cmp4 = or i1 %cmp.i, %cmp41
ret i1 %cmp4
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
i1 %cmp.i = #x1 (1)
i1 %cmp41 = poison

Source:
i1 %cmp4 = #x1 (1)

Target:
i1 %cmp4 = poison
Source value: #x1 (1)
Target value: poison

LebedevRI · 2020-12-02T11:14:40Z

... time to apply freeze?

aqjune · 2020-12-02T12:25:53Z

Would it be a big change if optimizations that deal with or/and are updated to deal with the corresponding select patterns?

nunoplopes · 2020-12-02T12:29:00Z

I tend to agree. I don't see much value in turning select -> and/or + freeze. I would rather keep the select and make sure the backends are happy with it.
I think Sanjay already did some work in SelDAG to make it like selects more AFAIR.

rotateright · 2020-12-02T13:48:04Z

Yes, we've done some work in codegen to make it more robust with select in IR. We've also tried to patch instcombine to make it more flexible about recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:

llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

Line 2627 in f03c21d

if (SelType->isIntOrIntVectorTy(1) &&

And see what regresses?

nunoplopes · 2020-12-02T14:13:50Z

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Sounds like a plan! :)
(that whole block seems incorrect, yes)

CongzheUalberta · 2020-12-02T22:19:23Z

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion!

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion! Are you going to upstream it? We are going to internally implement a patch as suggested and measure the performance anyways, so we could just post the bug/solution/result to Phabricator if that makes things easier. I'd appreciate your comment on it.

CongzheUalberta · 2020-12-03T05:40:08Z

I tend to agree. I don't see much value in turning select -> and/or +
freeze. I would rather keep the select and make sure the backends are happy
with it.
I think Sanjay already did some work in SelDAG to make it like selects more
AFAIR.

Thanks for the comment. I'm just curious what would be the potential solution of "turning select -> and/or + freeze", what does it look like? Is it like just freezing the result of and/or instruction?

aqjune · 2020-12-03T05:56:52Z

The solution that uses freeze is to insert freeze at the operand or and/or coming from non-conditional value of select:

select x, true, y -> or x, freeze(y)
select x, y, false -> and x, freeze(y)

aqjune · 2020-12-03T06:23:48Z

Here is the test: https://alive2.llvm.org/ce/z/dwFA_n

CongzheUalberta · 2020-12-03T14:31:29Z

Here is the test: https://alive2.llvm.org/ce/z/dwFA_n

Thanks for the demo! it makes perfect sense.

rotateright · 2020-12-08T17:10:57Z

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion!

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion! Are you going to upstream it? We are going to
internally implement a patch as suggested and measure the performance
anyways, so we could just post the bug/solution/result to Phabricator if
that makes things easier. I'd appreciate your comment on it.

Sorry, I missed that comment initially. If you can collect some perf measurements with that change and/or post to Phab, that would be great.

I just looked at some of the instcombine regression test diffs, and we may need to accelerate the canonicalization to the min/max intrinsics to avoid problems. But we should check codegen for various targets to see if those are really regressions that we must address.

CongzheUalberta · 2020-12-10T20:22:58Z

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion!

Yes, we've done some work in codegen to make it more robust with select in
IR. We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.

If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396

So we need to remove this block:
https://github.com/llvm/llvm-project/blob/
f03c21d/llvm/lib/Transforms/InstCombine/
InstCombineSelect.cpp#L2627

And see what regresses?

Thanks for the suggestion! Are you going to upstream it? We are going to
internally implement a patch as suggested and measure the performance
anyways, so we could just post the bug/solution/result to Phabricator if
that makes things easier. I'd appreciate your comment on it.

Sorry, I missed that comment initially. If you can collect some perf
measurements with that change and/or post to Phab, that would be great.

I just looked at some of the instcombine regression test diffs, and we may
need to accelerate the canonicalization to the min/max intrinsics to avoid
problems. But we should check codegen for various targets to see if those
are really regressions that we must address.

Thanks for the reply! I followed the suggestion that removes this block:

llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

Line 2627 in f03c21d

if (SelType->isIntOrIntVectorTy(1) &&

.

But then the built clang crashes when building some tests. The cause is that, since that block has been removed, when we now process instructions like "select i1 Condition, i1 TrueValue, i1 FalseValue", we'll enter the following code block to optimize it:

llvm-project/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp

Line 2669 in f03c21d

if (SelType->isIntOrIntVectorTy() &&

And this results in "zext i1 to i1", causing the compiler to crash.

My fix to the crash is that, in the condition checking (line 2669), adding the following checks to make sure TrueVal and FalseVal are not i1 type, which now looks like:

if (SelType->isIntOrIntVectorTy() &&
CondVal->getType()->isVectorTy() == SelType->isVectorTy() &&
(TrueVal->getType()->isIntegerTy() &&
TrueVal->getType()->getIntegerBitWidth() != 1) &&
(FalseVal->getType()->isIntegerTy() &&
FalseVal->getType()->getIntegerBitWidth() != 1))

I'm wondering if you have comments on this fix? If it looks fine I'll go ahead measuring the performance. Thanks!

CongzheUalberta · 2020-12-10T21:37:42Z

Posted the fix suggested to Phabricator as work-in-progress, will update the progress and performance numbers there. Any review is welcome, please just let me know if I happen to miss someone.

https://reviews.llvm.org/D93065

CongzheUalberta · 2020-12-22T20:07:02Z

*** Bug llvm/llvm-bugzilla-archive#48435 has been marked as a duplicate of this bug. ***

CongzheUalberta · 2021-11-27T03:52:20Z

mentioned in issue llvm/llvm-bugzilla-archive#48435

nikic · 2022-05-17T13:05:00Z

The logical and/or changes have since landed.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

nikic closed this as completed May 17, 2022

EugeneZelenko added the llvm:instcombine label May 17, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Incorrect folding of LShr into select instruction #47697

[InstCombine] Incorrect folding of LShr into select instruction #47697

CongzheUalberta commented Dec 1, 2020

LebedevRI commented Dec 2, 2020

LebedevRI commented Dec 2, 2020

aqjune commented Dec 2, 2020

nunoplopes commented Dec 2, 2020

rotateright commented Dec 2, 2020

nunoplopes commented Dec 2, 2020

CongzheUalberta commented Dec 2, 2020

CongzheUalberta commented Dec 3, 2020

aqjune commented Dec 3, 2020

aqjune commented Dec 3, 2020

CongzheUalberta commented Dec 3, 2020

rotateright commented Dec 8, 2020

CongzheUalberta commented Dec 10, 2020

CongzheUalberta commented Dec 10, 2020

CongzheUalberta commented Dec 22, 2020

CongzheUalberta commented Nov 27, 2021

nikic commented May 17, 2022

[InstCombine] Incorrect folding of LShr into select instruction #47697

[InstCombine] Incorrect folding of LShr into select instruction #47697

Comments

CongzheUalberta commented Dec 1, 2020

Extended Description

LebedevRI commented Dec 2, 2020

LebedevRI commented Dec 2, 2020

aqjune commented Dec 2, 2020

nunoplopes commented Dec 2, 2020

rotateright commented Dec 2, 2020

nunoplopes commented Dec 2, 2020

CongzheUalberta commented Dec 2, 2020

CongzheUalberta commented Dec 3, 2020

aqjune commented Dec 3, 2020

aqjune commented Dec 3, 2020

CongzheUalberta commented Dec 3, 2020

rotateright commented Dec 8, 2020

CongzheUalberta commented Dec 10, 2020

CongzheUalberta commented Dec 10, 2020

CongzheUalberta commented Dec 22, 2020

CongzheUalberta commented Nov 27, 2021

nikic commented May 17, 2022