I found an error in instcombine pass. Please look at the b.ll file attached. At line 956, the extension operation is sext. After instcombine, it has been changed to zext which causes semantic errors.
Created attachment 707 [details] the ll file The input file for opt -instcombine
Created attachment 708 [details] The output file The output file after instcombine.
Testcase: define i16 @test(i31 %zzz) { entry: %A = sext i31 %zzz to i32 %B = add i32 %A, 16384 %C = lshr i32 %B, 15 %D = trunc i32 %C to i16 ret i16 %D }
The differences are these (main ones highlighted with *** 360c360 < %newPhi117 = phi i8 [ 0, %bb586.loopexit.i ], [ %cast_tmp137, %bb95.i ]; <i8> [#uses=1] --- >%newPhi117 = phi i8 [ 0, %bb586.loopexit.i ], [ %cast_tmp137, %bb95.i ]; <i8> [#uses=2] 363,364c363,364 < %cast_tmp138 = zext i8 %newPhi117 to i32 ; <i32> [#uses=42] < %indvar39.i74.0 = trunc i32 %cast_tmp138 to i16 ; <i16> [#uses=1] --- > %cast_tmp138 = zext i8 %newPhi117 to i32 ; <i32> [#uses=41] > %indvar39.i74.0 = zext i8 %newPhi117 to i16 ; <i16> [#uses=1] 520c520 < %tmp470.i = sub i32 -9, %cast_tmp138 ; <i32> [#uses=1] --- > %tmp470.i = xor i32 %cast_tmp138, -1 ; <i32> [#uses=1] 770,771c770 < %tmp639640.i = sext i16 %tmp639.i to i32 ; <i32> [#uses=1] < %cast_tmp135 = trunc i32 %tmp639640.i to i26 ; <i26> [#uses=2] --- > %cast_tmp135 = sext i16 %tmp639.i to i26 ; <i26> [#uses=2] 885,886c884 < %tmp910.i = sext i16 %tmp9.i to i32 ; <i32> [#uses=1] < %cast_tmp132 = trunc i32 %tmp910.i to i28 ; <i28> [#uses=1] --- > %cast_tmp132 = sext i16 %tmp9.i to i28 ; <i28> [#uses=1] 924,925c922 < %tmp5556.i = sext i16 %tmp55.i to i32 ; <i32> [#uses=1] < %cast_tmp130 = trunc i32 %tmp5556.i to i30 ; <i30> [#uses=1] --- > %cast_tmp130 = sext i16 %tmp55.i to i30 ; <i30> [#uses=1] 963,964c960 < %tmp106107.i = sext i16 %tmp106.i12 to i32 ; <i32> [#uses=1] < %cast_tmp128 = trunc i32 %tmp106107.i to i31 ; <i31> [#uses=1] --- > %cast_tmp128 = sext i16 %tmp106.i12 to i31 ; <i31> [#uses=1] 966,967c962,963 ******* < %cast_tmp127 = sext i31 %newbin121 to i32 ; <i32> [#uses=1] < %tmp109.i = add i32 %cast_tmp127, 16384 ; <i32> [#uses=1] --- > %cast_tmp1272 = zext i31 %newbin121 to i32 ; <i32> [#uses=1] > %tmp109.i = add i32 %cast_tmp1272, 16384 ; <i32> [#uses=1] 1003,1004c999 < %tmp157158.i = sext i16 %tmp157.i to i32 ; <i32> [#uses=1] < %cast_tmp125 = trunc i32 %tmp157158.i to i31 ; <i31> [#uses=1] --- > %cast_tmp125 = sext i16 %tmp157.i to i31 ; <i31> [#uses=1] 1006,1007c1001,1002 ******** < %cast_tmp124 = sext i31 %newbin123 to i32 ; <i32> [#uses=1] < %tmp160.i19 = add i32 %cast_tmp124, 16384 ; <i32> [#uses=1] --- > %cast_tmp1241 = zext i31 %newbin123 to i32 ; <i32> [#uses=1] > %tmp160.i19 = add i32 %cast_tmp1241, 16384 ; <i32> [#uses=1]
Mine.
Leo, Please don't file any more of these InstCombine bugs. We know that InstructionCombining is broken for bitwidths like i31 and i26. Sheng and I are working to resolve this. When you file a bug like this it causes me to instantly stop working on things and check this bug out. I could have just spent the last hour actually solving your problem by working on InstCombine. Instead I had to investigate this bug which is a duplicate of one you have filed yourself. This bug is merely a reflection of SimplifyDemandedBits not supporting i31.
*** This bug has been marked as a duplicate of 1205 ***
It seems this is a bug after all. After bug 1205 was closed (Instcombine changes to support APInt), I reran the test case. It produced: define i16 @test(i31 %zzz) { entry: %A1 = zext i31 %zzz to i32 ; <i32> [#uses=1] %B = add i32 %A1, 16384 ; <i32> [#uses=1] %C = lshr i32 %B, 15 ; <i32> [#uses=1] %D = trunc i32 %C to i16 ; <i16> [#uses=1] ret i16 %D } i.e. the "sext" changed to zext without any other compensating changes. This needs to be investigated freshly to determine if the zext is okay or if this is truly a bug.
This is clearly a bug. The sign bit in the 31st bit (extended bit) is still live even after the lshr by 15 bits and the truncation.
SimplifyDemandedBits is computing a DemandedMask that doesn't include the high bit after the sext so it thinks this bit is not needed in the result. Consequently it opts to replace the sext with a zext because (perhaps) that's more efficient? I'm investigating why the DemandedMask doesn't have the high bit set.
The Demanded Mask makes these transitions: ret i16 %D 0xFFFF (all bits demanded in %D) %D = trunc i32 %C to i16 0x0000FFFF (widen to include trunc'd bits) %C = lshr i32 %B, 15 0x7FFF8000 (get shifted bits) %B = add i32 %A, 16384 0x7FFFFFFF (16 bit value added affects low bits) %A = sext i31 %zzz to i32 0x7FFFFFFF Despite my previous comment, this looks correct to me. The high bit is not included in the shift because the truncation occurs at 16 bits while the shift was 15 bits. This is reflected in the DemandedMask shown above. In this sitaution, InstCombine opts to replace a sext with an equivalent zext because the high bit is not preserved downstream. The code looks like: // If the input sign bit is known zero, or if the NewBits are not demanded // convert this into a zero extension. if (RHSKnownZero[SrcBitWidth-1] || (NewBits & ~DemandedMask) == NewBits) { // Convert to ZExt cast CastInst *NewCast = new ZExtInst(I->getOperand(0), VTy, I->getName(), I); return UpdateValueUsesWith(I, NewCast); Presumably this is because zext is cheaper than sext on some hardware. So, I'm claiming that the sext -> zext is correct and appropriate.
This xform is definitely correct.