Created attachment 21449 [details] bug 1 Consider this program: ``` define i32 @main() { entry: %call = tail call i64 @f(i8 249, i8 101) %call1 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i64 %call) ret i32 0 } declare i32 @printf(i8* nocapture readonly, ...) @.str = private unnamed_addr constant [4 x i8] c"%lu\00", align 1 define i64 @f(i8 %var2, i8 %var9) { %var6 = trunc i8 %var9 to i1 %var7 = trunc i8 175 to i1 %var3 = sub nsw i1 %var6, %var7 %var4 = icmp eq i64 1114591064, 1114591064 %var1 = udiv i1 %var3, %var4 %var0 = trunc i8 %var2 to i1 %res = sub nsw nuw i1 %var0, %var1 %res.cast = zext i1 %res to i64 ret i64 %res.cast } ``` This program should print 1, but it prints 0 after being compiled with -O0. $ clang --version clang version 9.0.0 (https://github.com/llvm-mirror/clang.git 23cffd780f5647d02c6c12491f9b2c928757c798) (https://github.com/llvm-mirror/llvm.git 903a5691e0214c29a4da3796f9a4d598ad3d18f3) Target: x86_64-unknown-linux-gnu $ clang -O0 -o a.out 2332174.ll $ ./a.out 0 I attach a few more programs which should print 1 but they print 0.
Created attachment 21450 [details] buggy inputs
Not sure if this is the whole problem, but TargetLowering::SimplifySetCC(): // (Z-X) == X --> Z == X<<1 This doesn't check to see if the operands are i1 (bool), so the shift produces poison/undef. We don't have this problem in IR because instcombine knows that add/sub of i1 is really an 'xor': if (Ty->isIntOrIntVectorTy(1)) return BinaryOperator::CreateXor(LHS, RHS);
(In reply to Sanjay Patel from comment #2) > Not sure if this is the whole problem, but TargetLowering::SimplifySetCC(): > // (Z-X) == X --> Z == X<<1 > > This doesn't check to see if the operands are i1 (bool), so the shift > produces poison/undef. > > We don't have this problem in IR because instcombine knows that add/sub of > i1 is really an 'xor': > > if (Ty->isIntOrIntVectorTy(1)) > return BinaryOperator::CreateXor(LHS, RHS); That might've been better stated as: We don't have this problem in IR because we don't do this questionable transform. Ie, why is a shift better than a subtract here? I took out the fold completely, and no regression tests fail, but I guess it's better to guard against the bug as a 1st step. Note that just adding the transforms for 'add/sub i1 -> xor i1' is not enough to fix the bug because we can reach the setcc first. (But adding those transforms might be a good change independently.)
(In reply to Sanjay Patel from comment #3) > > Ie, why is a shift better than a subtract here? I took out the fold > completely, and no regression tests fail, but I guess it's better to guard > against the bug as a 1st step. TBH, if this causes no test regressions I'd vote for removing it entirely straight away.
(In reply to Sanjay Patel from comment #3) > (In reply to Sanjay Patel from comment #2) > > Not sure if this is the whole problem, but TargetLowering::SimplifySetCC(): > > // (Z-X) == X --> Z == X<<1 > > > > This doesn't check to see if the operands are i1 (bool), so the shift > > produces poison/undef. > > > > We don't have this problem in IR because instcombine knows that add/sub of > > i1 is really an 'xor': > > > > if (Ty->isIntOrIntVectorTy(1)) > > return BinaryOperator::CreateXor(LHS, RHS); > > That might've been better stated as: > We don't have this problem in IR because we don't do this questionable > transform. > > Ie, why is a shift better than a subtract here? I took out the fold > completely, and no regression tests fail, but I guess it's better to guard > against the bug as a 1st step. > > Note that just adding the transforms for 'add/sub i1 -> xor i1' is not > enough to fix the bug because we can reach the setcc first. (But adding > those transforms might be a good change independently.) Hello Sanjay, I'm not sure whether X<<1 returns undef/poison in SelDAG (It is poison in IR, but SelDag is a lower level language), but I agree that the transformation can be the source of the bug, as SelDag and IR seems to share the semantics of arithmetic operations. Folding '(Z-X) == X' to 'Z == X<<1' seems problematic when X is 1-bit size. I found that disabling the folding fixes 8 buggy cases including the one in the text but 3 cases still remain: 1673800.ll: prints 0 8566692.ll: prints 0 9510019.ll: prints 0 All of them should print 1. I tried to track which transformation was responsible for this output, but I couldn't. Do you have any idea? p.s: 66787.ll should print 15, and disabling '(Z-X) == X' to 'Z == X<<1' correctly fixes it. I said that all programs should run 1 at comment #1, but 66787.ll was an exception, sorry.
(In reply to Juneyoung Lee from comment #5) > I'm not sure whether X<<1 returns undef/poison in SelDAG (It is poison in > IR, but SelDag is a lower level language), SDAG has poison because it propagates the wrapping/exact flags from IR. As a practical matter, it is harder to see poison/undef problems in this layer because: 1. Most of them would be exposed in the IR optimizer first. 2. SDAG does not have as many of the poison/undef-based transforms that IR performs. 3. The bug becomes invisible in the translation to the target's assembly. > I found that disabling the folding fixes 8 buggy cases including the one in > the text but 3 cases still remain: > > 1673800.ll: prints 0 > 8566692.ll: prints 0 > 9510019.ll: prints 0 > > All of them should print 1. > I tried to track which transformation was responsible for this output, but I > couldn't. Do you have any idea? Not yet - but let's take this 1 step at a time. I think there's no question about the 1st fix, so I have committed that here: https://reviews.llvm.org/rL353615
(In reply to Simon Pilgrim from comment #4) > (In reply to Sanjay Patel from comment #3) > > > > Ie, why is a shift better than a subtract here? I took out the fold > > completely, and no regression tests fail, but I guess it's better to guard > > against the bug as a 1st step. > > TBH, if this causes no test regressions I'd vote for removing it entirely > straight away. I added some tests to see if there was anything else doing the same thing. There isn't, and there is a minor potential benefit to the shift (which gets translated to add somewhere in X86ISelDAGToDAG.cpp), so I think it's ok to keep.
(In reply to Sanjay Patel from comment #7) > I added some tests to see if there was anything else doing the same thing. > There isn't, and there is a minor potential benefit to the shift (which gets > translated to add somewhere in X86ISelDAGToDAG.cpp), so I think it's ok to > keep. That was here: https://reviews.llvm.org/rL353619
(In reply to Sanjay Patel from comment #6) > > 1673800.ll: prints 0 This is the same problem from what I see. The code that was just fixed is duplicated later for the commuted case: // X == (Z-X) --> X<<1 == Z This whole "function" is a giant mess, so I'll see if I can reduce the code duplication.
All attached examples should be fixed after: https://reviews.llvm.org/rL353639 Feel free to reopen if that's not correct. Side note about fuzzing for LLVM bugs: If you're looking for backend bugs, it might be worth trying something like this: $ clang -O2 test.c -S -emit-llvm -Xclang -disable-llvm-optzns -o unoptimized_ir.ll $ llc -O2 unoptimized_ir.ll -o unoptimized_ir_optimized_asm.s $ clang unoptimized_ir_optimized_asm.s -o maybe_buggy_executable $ clang test.c -o reference_executable $ { compare output of } reference_executable maybe_buggy_executable A lot of people get that 1st step wrong: you want clang to create an IR file that allows optimization by the backend, but skip intermediate optimization. That doesn't happen if you use "-O0 -emit-llvm" with clang.
Thank you for the advice! :)