Created attachment 19760 [details] O3 output LLVM5.0 The `llvm5.ll` is the optimized (-O3) output of LLVM 5.0 opt of a simple testcase compiled by LDC. The testcase is setting %dt and %dt2 to the same value by different means, and asserts that they are equal indeed. `llvm6.ll` is the output of opt 6.0.0 when optimizing `llvm5.ll`: reproducer= `opt -O3 llvm5.ll -S -o llvm6.ll`. LLVM6.0 does a misoptimization and the assert fails. Zooming in: ``` %datum.DateTime = type { %datum.Date, %datum.TimeOfDay, [1 x i8] } %datum.Date = type { i16, i8, [1 x i8] } %datum.TimeOfDay = type { i8, i8, i8 } define i32 @main.... %dt2 = alloca i64, align 8 %tmpcast = bitcast i64* %dt2 to %datum.DateTime* ;... %5 = bitcast i64* %dt2 to i8* store i64 1, i64* %dt2, align 8 %6 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 0 call void @llvm.memset.p0i8.i64(i8* %6, i8 0, i64 3, i32 4, i1 false) #5 %7 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 1 store i8 30, i8* %7, align 1 ``` is optimized to ``` define i32 @main.... %dt2 = alloca i64, align 8 %tmpcast = bitcast i64* %dt2 to %datum.DateTime* ;... %5 = bitcast i64* %dt2 to i8* ; The following store is correct I think store i64 32985348833281, i64* %dt2, align 8 %6 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 0 ; This memset overwrites partly the correct value. ; I think if this memset would not be there, all would be fine. call void @llvm.memset.p0i8.i64(i8* nonnull %6, i8 0, i64 3, i32 4, i1 false) #4 ``` [Remark: I can see that perhaps endianness is tricky here. But actually, the LLVM optimizer itself converted the type of %dt2 to i64* (instead of %datum.DateTime*).]
Created attachment 19761 [details] llvm6.ll output of LLVM6.0's opt on with llvm5.ll as input
I believe this is a case of: define void @test38(i32* %P, i32* %Q) { store i32 1, i32* %P %P2 = bitcast i32* %P to i8* store i32 2, i32* %Q store i8 3, i8* %P2 ret void } store 3 cannot be merged into store 1, because the store 2 to Q mayalias P. The PartialStoreMerging we have doesn't take this into account I don't think.
Thanks for cc'ing me, Dave. For reference, partial store merging was: https://reviews.llvm.org/D30703 https://reviews.llvm.org/rL314206 I commandeered that patch, but I still don't know much about this pass. Let me see if I can take alias analysis into account, so we avoid this.
Maybe a check of memoryIsNotModifiedBetween is all we need? Same as noop stores. It may not be the most efficient, but I think checks what we want.
(In reply to Dave Green from comment #4) > Maybe a check of memoryIsNotModifiedBetween is all we need? Same as noop > stores. It may not be the most efficient, but I think checks what we want. Sounds good to me...patch coming up. Thanks!
https://reviews.llvm.org/D42663
Should this be a 6.00 blocker?
(In reply to Simon Pilgrim from comment #7) > Should this be a 6.00 blocker? Yes, I think so - it's a recently introduced miscompile.
(In reply to Sanjay Patel from comment #8) > (In reply to Simon Pilgrim from comment #7) > > Should this be a 6.00 blocker? > > Yes, I think so - it's a recently introduced miscompile. Thanks for elevating to a 6.0.0 blocker. LDC generates code that encounters this bug. The bug results in LDC failing its testsuite, so would be a blocker for releasing LDC linked to LLVM 6.0.0. (I haven't tried to make a testcase that would reproduce with Clang)
Fixed in trunk: https://reviews.llvm.org/rL323759 Leaving the bug open to allow baking time there and - assuming success - merging in the branch.
*** Bug 35923 has been marked as a duplicate of this bug. ***
I can confirm that the original D testcase now passes with LDC+LLVMtrunk. Thanks for the fix! (Sidenote: In the code, I read a TODO comment "Deal with [...] some mem intrinsics (if needed)". As you can see from the testcase IR, we generate a call to the memset intrinsic to set the padding bytes of structs to zero (padding bytes are put in arrays of i8). So for us, it'd be beneficial if the store merge code learns how to deal with that.)
(In reply to Johan Engelen from comment #12) > I can confirm that the original D testcase now passes with LDC+LLVMtrunk. > Thanks for the fix! > > (Sidenote: In the code, I read a TODO comment "Deal with [...] some mem > intrinsics (if needed)". As you can see from the testcase IR, we generate a > call to the memset intrinsic to set the padding bytes of structs to zero > (padding bytes are put in arrays of i8). So for us, it'd be beneficial if > the store merge code learns how to deal with that.) Thanks for verifying, Johan. Can you open a new bug report for the missed optimization? We don't want to lose track of that request when this report is closed.
(In reply to Sanjay Patel from comment #10) > Fixed in trunk: > https://reviews.llvm.org/rL323759 > > Leaving the bug open to allow baking time there and - assuming success - > merging in the branch. Merged in r324086. (In reply to Sanjay Patel from comment #13) > (In reply to Johan Engelen from comment #12) > > I can confirm that the original D testcase now passes with LDC+LLVMtrunk. > > Thanks for the fix! > > > > (Sidenote: In the code, I read a TODO comment "Deal with [...] some mem > > intrinsics (if needed)". As you can see from the testcase IR, we generate a > > call to the memset intrinsic to set the padding bytes of structs to zero > > (padding bytes are put in arrays of i8). So for us, it'd be beneficial if > > the store merge code learns how to deal with that.) > > Thanks for verifying, Johan. > > Can you open a new bug report for the missed optimization? We don't want to > lose track of that request when this report is closed. Was a new bug filed? I'd like to close this.
(In reply to Hans Wennborg from comment #14) > Was a new bug filed? I'd like to close this. Closing - I've taken a shot at what I think the problem is with bug 36211.