Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect optimization of gep without inbounds + load -> icmp eq #44555

Closed
nunoplopes opened this issue Mar 15, 2020 · 10 comments
Closed

Incorrect optimization of gep without inbounds + load -> icmp eq #44555

nunoplopes opened this issue Mar 15, 2020 · 10 comments
Labels
bugzilla Issues migrated from bugzilla miscompilation

Comments

@nunoplopes
Copy link
Member

Bugzilla Link 45210
Resolution FIXED
Resolved on May 31, 2021 10:24
Version trunk
OS All
CC @efriedma-quic,@aqjune,@LebedevRI,@regehr,@rotateright

Extended Description

The following unit test in "Transforms/InstCombine/load-cmp.ll" exposes an incorrect optimization:

define i1 @​test1_noinbounds(i32 %X) {
; CHECK-LABEL: @​test1_noinbounds(
; CHECK-NEXT: [[R:%.]] = icmp eq i32 [[X:%.]], 9
; CHECK-NEXT: ret i1 [[R]]
;
%P = getelementptr [10 x i16], [10 x i16]* @​G16, i32 0, i32 %X
%Q = load i16, i16* %P
%R = icmp eq i16 %Q, 0
ret i1 %R
}

Output of Alive2. TL;DR: the optimization is only correct with gep inbounds, otherwise the transformation to "icmp eq" misses the overflow case.

@​G16 = constant 20 bytes, align 16

define i1 @​test1_noinbounds(i32 %X) {
#init:
store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @​G16, align 16
br label %0

%0:
%P = gep * @​G16, 20 x i32 0, 2 x i32 %X
%Q = load i16, * %P, align 2
%R = icmp eq i16 %Q, 0
ret i1 %R
}
=>
@​G16 = constant 20 bytes, align 16

define i1 @​test1_noinbounds(i32 %X) {
#init:
store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @​G16, align 16
br label %0

%0:
%R = icmp eq i32 %X, 9
ret i1 %R
}

Transformation doesn't verify!
ERROR: Value mismatch

Example:
i32 %X = #x80000009 (2147483657, -2147483639)

Source:

  • %P = pointer(non-local, block_id=1, offset=18)
    i16 %Q = #x0000 (0)
    i1 %R = #x1 (1)

Target:
i1 %R = #x0 (0)
Source value: #x1 (1)
Target value: #x0 (0)

@rotateright
Copy link
Contributor

Not sure if I'm using it correctly, but the online instance of Alive2 does not show this as invalid?
http://volta.cs.utah.edu:8080/z/UutRJe

@rotateright
Copy link
Contributor

The quick fix:

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 63bf3462faa..c4d7a956025 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -159,7 +159,7 @@ Instruction *InstCombiner::foldCmpLoadFromIndexedGlobal(GetElementPtrInst *GEP,
// the simple index into a single-dimensional array.
//
// Require: GEP GV, 0, i {{, constant indices}}

  • if (GEP->getNumOperands() < 3 ||
  • if (!GEP->isInBounds() || GEP->getNumOperands() < 3 ||
    !isa(GEP->getOperand(1)) ||
    !cast(GEP->getOperand(1))->isZero() ||
    isa(GEP->getOperand(2)))

But that causes 5 other test changes (no more optimization) in that same regression test file.

I think the tests were meant to exercise various possibilities for that index variable such as "what if its size is too small to cause an overflow?"

That is, do we need to use the datalayout in addition to 'inbounds' to decide how to limit this transform?

@nunoplopes
Copy link
Member Author

Not sure if I'm using it correctly, but the online instance of Alive2 does
not show this as invalid?
http://volta.cs.utah.edu:8080/z/UutRJe

You're doing it right; I just think the online version is a bit old.
John, could you check please?

@regehr
Copy link
Contributor

regehr commented Mar 17, 2020

I just updated both LLVM and Alive2. The example is still reading as correct, and I have made sure I'm not seeing cached results.

@efriedma-quic
Copy link
Collaborator

That is, do we need to use the datalayout in addition to 'inbounds' to
decide how to limit this transform?

For the given testcase, we only produce the wrong result for precisely x == 0x80000009. We could mask off the high bit to produce the correct result, I think. The general form of the transform is something like "(x & (-1 >> TrailingBits(sizeof(IndexedType)))) == n".

@aqjune
Copy link
Contributor

aqjune commented Mar 18, 2020

I could reproduce this with 64 bits offset:
http://volta.cs.utah.edu:8080/z/x-9fib

Alive2 says the offset can be truncated and compared as well:
http://volta.cs.utah.edu:8080/z/3ABHi3

@nunoplopes
Copy link
Member Author

Right, both the target data & sizeof(type) are important because the bug is triggered as Eli say when sext(idx) * sizeof(type) overflows.

It seems there's a bug in Alive2 and/or the online tool ATM, sorry. We are looking into it. The example I posted is valid though.

@regehr
Copy link
Contributor

regehr commented Mar 18, 2020

Alive2 + CE problem was my fault. Should be fixed now, sorry about that.

@efriedma-quic
Copy link
Collaborator

Alive2 link: https://alive2.llvm.org/ce/z/bGG9Vy

@nunoplopes
Copy link
Member Author

Fixed: https://reviews.llvm.org/D99481

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla miscompilation
Projects
None yet
Development

No branches or pull requests

5 participants