44861 – InstCombine incorrectly rewrites indices of gep inbounds

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 44861 - InstCombine incorrectly rewrites indices of gep inbounds

Summary: InstCombine incorrectly rewrites indices of gep inbounds

Status:	NEW

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations (show other bugs)
Version:	trunk
Hardware:	PC All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-02-09 19:14 PST by Juneyoung Lee
Modified:	2020-02-14 06:05 PST (History)
CC List:	7 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Juneyoung Lee 2020-02-09 19:14:38 PST

```
$ cat input.ll
declare void @g(i8*)

define void @foo(i8* %ptr, i1 %cc1, i1 %cc2, i1 %cc3, i32 %arg) {
entry:
  br i1 %cc1, label %do.end, label %if.then

if.then:
  br i1 %cc2, label %do.body, label %do.end

do.body:
  %phi = phi i32 [ %arg, %if.then ]
  %phi.ext = zext i32 %phi to i64
  %ptr2 = getelementptr inbounds i8, i8* %ptr, i64 %phi.ext
  %ptr3 = getelementptr inbounds i8, i8* %ptr2, i64 -1
  call void @g(i8* %ptr3)
  br i1 %cc3, label %do.end, label %if.then

do.end:
  ret void
}

$ opt -instcombine -S -o - input.ll
declare void @g(i8*)

define void @foo(i8* %ptr, i1 %cc1, i1 %cc2, i1 %cc3, i32 %arg) {
  br i1 %cc1, label %do.end, label %if.then

if.then:                                          ; preds = %do.body, %entry
  br i1 %cc2, label %do.body, label %do.end

do.body:                                          ; preds = %if.then
  %phi.ext = zext i32 %arg to i64
  %ptr2 = getelementptr inbounds i8, i8* %ptr, i64 -1
  %ptr3 = getelementptr inbounds i8, i8* %ptr2, i64 %phi.ext
  call void @g(i8* nonnull %ptr3)
  br i1 %cc3, label %do.end, label %if.then

do.end:                                           ; preds = %do.body, %if.then, %entry
  ret void
}
```

If %arg is 1, ptr3 in the source is `gep inb (gep inb ptr, 1), -1` whereas ptr3 in the target is `gep inb (gep inb ptr, -1), 1`.
This is incorrect because ptr3 in the tgt may be poison whereas ptr3 in the source isn't.

Comment 1 Juneyoung Lee 2020-02-14 06:05:47 PST

I track down reason and it was because InstCombine::visitGetElementPtrInst try to swap index operands of geps when it can exhibit more opportunities to LICM:

```
/* at InstructionCombining.cpp */
...
    // Try to reassociate loop invariant GEP chains to enable LICM.
    if (LI && Src->getNumOperands() == 2 && GEP.getNumOperands() == 2 &&
        Src->hasOneUse()) {
...
```

ptr3's gep has a constant index (-1), so InstCombine chose to swap it with ptr2's index (%phi.ext) to enable code motion of the updated ptr2.

Here are a few solutions that I have:

(1) do this transformation when the two indices are known to have the same sign or either of those is zero.
For example, if ptr3's index is 1 instead of -1, this transformation is allowed.
(2) drop inbound flags when doing the transformation.