The following unit test in "Transforms/InstCombine/load-cmp.ll" exposes an incorrect optimization: define i1 @test1_noinbounds(i32 %X) { ; CHECK-LABEL: @test1_noinbounds( ; CHECK-NEXT: [[R:%.*]] = icmp eq i32 [[X:%.*]], 9 ; CHECK-NEXT: ret i1 [[R]] ; %P = getelementptr [10 x i16], [10 x i16]* @G16, i32 0, i32 %X %Q = load i16, i16* %P %R = icmp eq i16 %Q, 0 ret i1 %R } Output of Alive2. TL;DR: the optimization is only correct with gep inbounds, otherwise the transformation to "icmp eq" misses the overflow case. @G16 = constant 20 bytes, align 16 define i1 @test1_noinbounds(i32 %X) { #init: store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @G16, align 16 br label %0 %0: %P = gep * @G16, 20 x i32 0, 2 x i32 %X %Q = load i16, * %P, align 2 %R = icmp eq i16 %Q, 0 ret i1 %R } => @G16 = constant 20 bytes, align 16 define i1 @test1_noinbounds(i32 %X) { #init: store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @G16, align 16 br label %0 %0: %R = icmp eq i32 %X, 9 ret i1 %R } Transformation doesn't verify! ERROR: Value mismatch Example: i32 %X = #x80000009 (2147483657, -2147483639) Source: * %P = pointer(non-local, block_id=1, offset=18) i16 %Q = #x0000 (0) i1 %R = #x1 (1) Target: i1 %R = #x0 (0) Source value: #x1 (1) Target value: #x0 (0)
Not sure if I'm using it correctly, but the online instance of Alive2 does not show this as invalid? http://volta.cs.utah.edu:8080/z/UutRJe
The quick fix: diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp index 63bf3462faa..c4d7a956025 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp @@ -159,7 +159,7 @@ Instruction *InstCombiner::foldCmpLoadFromIndexedGlobal(GetElementPtrInst *GEP, // the simple index into a single-dimensional array. // // Require: GEP GV, 0, i {{, constant indices}} - if (GEP->getNumOperands() < 3 || + if (!GEP->isInBounds() || GEP->getNumOperands() < 3 || !isa<ConstantInt>(GEP->getOperand(1)) || !cast<ConstantInt>(GEP->getOperand(1))->isZero() || isa<Constant>(GEP->getOperand(2))) But that causes 5 other test changes (no more optimization) in that same regression test file. I think the tests were meant to exercise various possibilities for that index variable such as "what if its size is too small to cause an overflow?" That is, do we need to use the datalayout in addition to 'inbounds' to decide how to limit this transform?
(In reply to Sanjay Patel from comment #1) > Not sure if I'm using it correctly, but the online instance of Alive2 does > not show this as invalid? > http://volta.cs.utah.edu:8080/z/UutRJe You're doing it right; I just think the online version is a bit old. John, could you check please?
I just updated both LLVM and Alive2. The example is still reading as correct, and I have made sure I'm not seeing cached results.
(In reply to Sanjay Patel from comment #2) > That is, do we need to use the datalayout in addition to 'inbounds' to > decide how to limit this transform? For the given testcase, we only produce the wrong result for precisely x == 0x80000009. We could mask off the high bit to produce the correct result, I think. The general form of the transform is something like "(x & (-1 >> TrailingBits(sizeof(IndexedType)))) == n".
I could reproduce this with 64 bits offset: http://volta.cs.utah.edu:8080/z/x-9fib Alive2 says the offset can be truncated and compared as well: http://volta.cs.utah.edu:8080/z/3ABHi3
Right, both the target data & sizeof(type) are important because the bug is triggered as Eli say when sext(idx) * sizeof(type) overflows. It seems there's a bug in Alive2 and/or the online tool ATM, sorry. We are looking into it. The example I posted is valid though.
Alive2 + CE problem was my fault. Should be fixed now, sorry about that.
Alive2 link: https://alive2.llvm.org/ce/z/bGG9Vy
Fixed: https://reviews.llvm.org/D99481