LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 40657 - IR programs printing incorrect results after being compiled with -O0
Summary: IR programs printing incorrect results after being compiled with -O0
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Common Code Generator Code (show other bugs)
Version: trunk
Hardware: PC Linux
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-07 23:53 PST by Juneyoung Lee
Modified: 2019-02-11 00:27 PST (History)
7 users (show)

See Also:
Fixed By Commit(s): r353615, r353639


Attachments
bug 1 (712 bytes, text/plain)
2019-02-07 23:53 PST, Juneyoung Lee
Details
buggy inputs (5.06 KB, application/x-zip-compressed)
2019-02-07 23:54 PST, Juneyoung Lee
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Juneyoung Lee 2019-02-07 23:53:00 PST
Created attachment 21449 [details]
bug 1

Consider this program:

```

define i32 @main() {
entry:
  %call = tail call i64 @f(i8 249, i8 101)
  %call1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i64 %call)
  ret i32 0
}

declare i32 @printf(i8* nocapture readonly, ...)

@.str = private unnamed_addr constant [4 x i8] c"%lu\00", align 1
define i64 @f(i8 %var2, i8 %var9) {
  %var6 = trunc i8 %var9 to i1
  %var7 = trunc i8 175 to i1
  %var3 = sub nsw i1 %var6, %var7
  %var4 = icmp eq i64 1114591064, 1114591064
  %var1 = udiv i1 %var3, %var4
  %var0 = trunc i8 %var2 to i1
  %res = sub nsw nuw i1 %var0, %var1
  %res.cast = zext i1 %res to i64
  ret i64 %res.cast
}
```

This program should print 1, but it prints 0 after being compiled with -O0.

$ clang --version
clang version 9.0.0 (https://github.com/llvm-mirror/clang.git 23cffd780f5647d02c6c12491f9b2c928757c798) (https://github.com/llvm-mirror/llvm.git 903a5691e0214c29a4da3796f9a4d598ad3d18f3)
Target: x86_64-unknown-linux-gnu
$ clang -O0 -o a.out 2332174.ll
$ ./a.out
0

I attach a few more programs which should print 1 but they print 0.
Comment 1 Juneyoung Lee 2019-02-07 23:54:29 PST
Created attachment 21450 [details]
buggy inputs
Comment 2 Sanjay Patel 2019-02-08 13:40:46 PST
Not sure if this is the whole problem, but TargetLowering::SimplifySetCC():
            // (Z-X) == X  --> Z == X<<1

This doesn't check to see if the operands are i1 (bool), so the shift produces poison/undef.

We don't have this problem in IR because instcombine knows that add/sub of i1 is really an 'xor':

  if (Ty->isIntOrIntVectorTy(1))
    return BinaryOperator::CreateXor(LHS, RHS);
Comment 3 Sanjay Patel 2019-02-08 15:25:51 PST
(In reply to Sanjay Patel from comment #2)
> Not sure if this is the whole problem, but TargetLowering::SimplifySetCC():
>             // (Z-X) == X  --> Z == X<<1
> 
> This doesn't check to see if the operands are i1 (bool), so the shift
> produces poison/undef.
> 
> We don't have this problem in IR because instcombine knows that add/sub of
> i1 is really an 'xor':
> 
>   if (Ty->isIntOrIntVectorTy(1))
>     return BinaryOperator::CreateXor(LHS, RHS);

That might've been better stated as:
We don't have this problem in IR because we don't do this questionable transform. 

Ie, why is a shift better than a subtract here? I took out the fold completely, and no regression tests fail, but I guess it's better to guard against the bug as a 1st step.

Note that just adding the transforms for 'add/sub i1 -> xor i1' is not enough to fix the bug because we can reach the setcc first. (But adding those transforms might be a good change independently.)
Comment 4 Simon Pilgrim 2019-02-09 03:59:49 PST
(In reply to Sanjay Patel from comment #3)
> 
> Ie, why is a shift better than a subtract here? I took out the fold
> completely, and no regression tests fail, but I guess it's better to guard
> against the bug as a 1st step.

TBH, if this causes no test regressions I'd vote for removing it entirely straight away.
Comment 5 Juneyoung Lee 2019-02-09 04:35:58 PST
(In reply to Sanjay Patel from comment #3)
> (In reply to Sanjay Patel from comment #2)
> > Not sure if this is the whole problem, but TargetLowering::SimplifySetCC():
> >             // (Z-X) == X  --> Z == X<<1
> > 
> > This doesn't check to see if the operands are i1 (bool), so the shift
> > produces poison/undef.
> > 
> > We don't have this problem in IR because instcombine knows that add/sub of
> > i1 is really an 'xor':
> > 
> >   if (Ty->isIntOrIntVectorTy(1))
> >     return BinaryOperator::CreateXor(LHS, RHS);
> 
> That might've been better stated as:
> We don't have this problem in IR because we don't do this questionable
> transform. 
> 
> Ie, why is a shift better than a subtract here? I took out the fold
> completely, and no regression tests fail, but I guess it's better to guard
> against the bug as a 1st step.
> 
> Note that just adding the transforms for 'add/sub i1 -> xor i1' is not
> enough to fix the bug because we can reach the setcc first. (But adding
> those transforms might be a good change independently.)

Hello Sanjay,

I'm not sure whether X<<1 returns undef/poison in SelDAG (It is poison in IR, but SelDag is a lower level language), but I agree that the transformation can be the source of the bug, as SelDag and IR seems to share the semantics of arithmetic operations.
Folding '(Z-X) == X' to 'Z == X<<1' seems problematic when X is 1-bit size.

I found that disabling the folding fixes 8 buggy cases including the one in the text but 3 cases still remain:

1673800.ll: prints 0                
8566692.ll: prints 0                
9510019.ll: prints 0                

All of them should print 1.
I tried to track which transformation was responsible for this output, but I couldn't. Do you have any idea?

p.s: 66787.ll should print 15, and disabling '(Z-X) == X' to 'Z == X<<1' correctly fixes it. I said that all programs should run 1 at comment #1, but 66787.ll was an exception, sorry.
Comment 6 Sanjay Patel 2019-02-09 08:03:08 PST
(In reply to Juneyoung Lee from comment #5)
> I'm not sure whether X<<1 returns undef/poison in SelDAG (It is poison in
> IR, but SelDag is a lower level language), 

SDAG has poison because it propagates the wrapping/exact flags from IR. As a practical matter, it is harder to see poison/undef problems in this layer because:
1. Most of them would be exposed in the IR optimizer first.
2. SDAG does not have as many of the poison/undef-based transforms that IR performs.
3. The bug becomes invisible in the translation to the target's assembly. 


> I found that disabling the folding fixes 8 buggy cases including the one in
> the text but 3 cases still remain:
> 
> 1673800.ll: prints 0                
> 8566692.ll: prints 0                
> 9510019.ll: prints 0                
> 
> All of them should print 1.
> I tried to track which transformation was responsible for this output, but I
> couldn't. Do you have any idea?

Not yet - but let's take this 1 step at a time. I think there's no question about the 1st fix, so I have committed that here:
https://reviews.llvm.org/rL353615
Comment 7 Sanjay Patel 2019-02-09 09:08:28 PST
(In reply to Simon Pilgrim from comment #4)
> (In reply to Sanjay Patel from comment #3)
> > 
> > Ie, why is a shift better than a subtract here? I took out the fold
> > completely, and no regression tests fail, but I guess it's better to guard
> > against the bug as a 1st step.
> 
> TBH, if this causes no test regressions I'd vote for removing it entirely
> straight away.

I added some tests to see if there was anything else doing the same thing. There isn't, and there is a minor potential benefit to the shift (which gets translated to add somewhere in X86ISelDAGToDAG.cpp), so I think it's ok to keep.
Comment 8 Sanjay Patel 2019-02-09 09:14:33 PST
(In reply to Sanjay Patel from comment #7)
> I added some tests to see if there was anything else doing the same thing.
> There isn't, and there is a minor potential benefit to the shift (which gets
> translated to add somewhere in X86ISelDAGToDAG.cpp), so I think it's ok to
> keep.

That was here:
https://reviews.llvm.org/rL353619
Comment 9 Sanjay Patel 2019-02-09 09:31:33 PST
(In reply to Sanjay Patel from comment #6) 
> > 1673800.ll: prints 0                

This is the same problem from what I see. The code that was just fixed is duplicated later for the commuted case:
          // X == (Z-X)  --> X<<1 == Z

This whole "function" is a giant mess, so I'll see if I can reduce the code duplication.
Comment 10 Sanjay Patel 2019-02-10 07:03:46 PST
All attached examples should be fixed after:
https://reviews.llvm.org/rL353639

Feel free to reopen if that's not correct.

Side note about fuzzing for LLVM bugs:
If you're looking for backend bugs, it might be worth trying something like this:

$ clang -O2 test.c -S -emit-llvm -Xclang -disable-llvm-optzns -o unoptimized_ir.ll
$ llc -O2 unoptimized_ir.ll -o unoptimized_ir_optimized_asm.s
$ clang unoptimized_ir_optimized_asm.s -o maybe_buggy_executable
$ clang test.c -o reference_executable
$ { compare output of } reference_executable maybe_buggy_executable

A lot of people get that 1st step wrong: you want clang to create an IR file that allows optimization by the backend, but skip intermediate optimization. That doesn't happen if you use "-O0 -emit-llvm" with clang.
Comment 11 Juneyoung Lee 2019-02-11 00:27:47 PST
Thank you for the advice! :)