45210 – Incorrect optimization of gep without inbounds + load -> icmp eq

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 45210 - Incorrect optimization of gep without inbounds + load -> icmp eq

Summary: Incorrect optimization of gep without inbounds + load -> icmp eq

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations (show other bugs)
Version:	trunk
Hardware:	All All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:	miscompilation

Depends on:
Blocks:

Reported:	2020-03-15 04:31 PDT by Nuno Lopes
Modified:	2021-05-31 10:24 PDT (History)
CC List:	7 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Nuno Lopes 2020-03-15 04:31:29 PDT

The following unit test in "Transforms/InstCombine/load-cmp.ll" exposes an incorrect optimization:

define i1 @test1_noinbounds(i32 %X) {
; CHECK-LABEL: @test1_noinbounds(
; CHECK-NEXT:    [[R:%.*]] = icmp eq i32 [[X:%.*]], 9
; CHECK-NEXT:    ret i1 [[R]]
;
  %P = getelementptr [10 x i16], [10 x i16]* @G16, i32 0, i32 %X
  %Q = load i16, i16* %P
  %R = icmp eq i16 %Q, 0
  ret i1 %R
}


Output of Alive2. TL;DR: the optimization is only correct with gep inbounds, otherwise the transformation to "icmp eq" misses the overflow case.

@G16 = constant 20 bytes, align 16

define i1 @test1_noinbounds(i32 %X) {
#init:
  store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @G16, align 16
  br label %0

%0:
  %P = gep * @G16, 20 x i32 0, 2 x i32 %X
  %Q = load i16, * %P, align 2
  %R = icmp eq i16 %Q, 0
  ret i1 %R
}
=>
@G16 = constant 20 bytes, align 16

define i1 @test1_noinbounds(i32 %X) {
#init:
  store [10 x i16] { 35, 82, 69, 81, 85, 73, 82, 69, 68, 0 }, * @G16, align 16
  br label %0

%0:
  %R = icmp eq i32 %X, 9
  ret i1 %R
}


Transformation doesn't verify!
ERROR: Value mismatch

Example:
i32 %X = #x80000009 (2147483657, -2147483639)

Source:
* %P = pointer(non-local, block_id=1, offset=18)
i16 %Q = #x0000 (0)
i1 %R = #x1 (1)

Target:
i1 %R = #x0 (0)
Source value: #x1 (1)
Target value: #x0 (0)

Comment 1 Sanjay Patel 2020-03-17 10:39:26 PDT

Not sure if I'm using it correctly, but the online instance of Alive2 does not show this as invalid?
http://volta.cs.utah.edu:8080/z/UutRJe

Comment 2 Sanjay Patel 2020-03-17 10:55:10 PDT

The quick fix:

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 63bf3462faa..c4d7a956025 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -159,7 +159,7 @@ Instruction *InstCombiner::foldCmpLoadFromIndexedGlobal(GetElementPtrInst *GEP,
   // the simple index into a single-dimensional array.
   //
   // Require: GEP GV, 0, i {{, constant indices}}
-  if (GEP->getNumOperands() < 3 ||
+  if (!GEP->isInBounds() || GEP->getNumOperands() < 3 ||
       !isa<ConstantInt>(GEP->getOperand(1)) ||
       !cast<ConstantInt>(GEP->getOperand(1))->isZero() ||
       isa<Constant>(GEP->getOperand(2)))


But that causes 5 other test changes (no more optimization) in that same regression test file. 

I think the tests were meant to exercise various possibilities for that index variable such as "what if its size is too small to cause an overflow?" 

That is, do we need to use the datalayout in addition to 'inbounds' to decide how to limit this transform?

Comment 3 Nuno Lopes 2020-03-17 10:55:59 PDT

(In reply to Sanjay Patel from comment #1)
> Not sure if I'm using it correctly, but the online instance of Alive2 does
> not show this as invalid?
> http://volta.cs.utah.edu:8080/z/UutRJe

You're doing it right; I just think the online version is a bit old.
John, could you check please?

Comment 4 John Regehr 2020-03-17 13:43:51 PDT

I just updated both LLVM and Alive2. The example is still reading as correct, and I have made sure I'm not seeing cached results.

Comment 5 Eli Friedman 2020-03-17 14:23:50 PDT

(In reply to Sanjay Patel from comment #2)
> That is, do we need to use the datalayout in addition to 'inbounds' to
> decide how to limit this transform?

For the given testcase, we only produce the wrong result for precisely x == 0x80000009.  We could mask off the high bit to produce the correct result, I think.  The general form of the transform is something like "(x & (-1 >> TrailingBits(sizeof(IndexedType)))) == n".

Comment 6 Juneyoung Lee 2020-03-17 17:38:36 PDT

I could reproduce this with 64 bits offset:
http://volta.cs.utah.edu:8080/z/x-9fib

Alive2 says the offset can be truncated and compared as well:
http://volta.cs.utah.edu:8080/z/3ABHi3

Comment 7 Nuno Lopes 2020-03-18 10:37:49 PDT

Right, both the target data & sizeof(type) are important because the bug is triggered as Eli say when sext(idx) * sizeof(type) overflows.

It seems there's a bug in Alive2 and/or the online tool ATM, sorry. We are looking into it. The example I posted is valid though.

Comment 8 John Regehr 2020-03-18 10:48:56 PDT

Alive2 + CE problem was my fault. Should be fixed now, sorry about that.

Comment 9 Eli Friedman 2020-07-08 13:52:32 PDT

Alive2 link: https://alive2.llvm.org/ce/z/bGG9Vy

Comment 10 Nuno Lopes 2021-05-31 10:24:49 PDT

Fixed: https://reviews.llvm.org/D99481