A shuffle undef mask should select either of the elements. If one of them happens to be poison, then the result can be poison. The langref [1] and a transformation in instcombine seems forgot considering the poison case. See below the transformation of insertelement to shufflevector. Assume the function input %arg is poison, the target program shuffles the vector <poison, undef, undef, undef> with mask <undef, 0, 0, 0>, the result should be <poison, poison, poison, poison>, whereas the source program evaluates to <undef, poison, poison, poison>. The target is more poisonous than the source. llvm/test/Transforms/InstCombine/broadcast.ll define <4 x float> @splat_undef1(float %arg) { ; CHECK-LABEL: @splat_undef1( ; CHECK-NEXT: [[TMP1:%.*]] = insertelement <4 x float> undef, float [[ARG:%.*]], i32 0 ; CHECK-NEXT: [[T6:%.*]] = shufflevector <4 x float> [[TMP1]], <4 x float> undef, <4 x i32> <i32 undef, i32 0, i32 0, i32 0> ; CHECK-NEXT: ret <4 x float> [[T6]] ; %t = insertelement <4 x float> undef, float %arg, i32 1 %t4 = insertelement <4 x float> %t, float %arg, i32 1 %t5 = insertelement <4 x float> %t4, float %arg, i32 2 %t6 = insertelement <4 x float> %t5, float %arg, i32 3 ret <4 x float> %t6 } [1] https://llvm.org/docs/LangRef.html#id186
(In reply to Zhengyang Liu from comment #0) > A shuffle undef mask should select either of the elements. If one of them > happens to be poison, then the result can be poison. The langref [1] and a > transformation in instcombine seems forgot considering the poison case. Unless I'm mis-remembering the history and based on the discussion in: https://reviews.llvm.org/D31980 (bug 32486) The example and LangRef are correct as-is. We decided that an undef element of a shuffle mask means the result really is undefined and does not need to be picked from one of the input vectors. I'm not sure if the definition of poison has been refined since that time to alter that decision. The current/existing IR transform makes the most sense for optimization. Ie, if we decide that the example needs to be changed to something more specific, then we may lose optimizations (probably not for this simple example, but something involving larger IR-to-IR transforms): define <4 x float> @less_poison(float %arg) { %t1 = insertelement <4 x float> undef, float %arg, i32 0 %t6 = shufflevector <4 x float> %t1, <4 x float> undef, <4 x i32> <i32 4, i32 0, i32 0, i32 0> ret <4 x float> %t6 } (The shuffle mask no longer contains an undef element, so that's not a simple "splat".)
Do we allow elements of a vector to be independently poisoned? LangRef currently says we don't, but I'm not sure that's actually what we implement. If we do allow elements to be independently poisoned, shufflevector needs to explicitly specify the interaction with poison, I think. Consider the following example: define <4 x float> @shuffle(<4 x float> %arg) { %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3> ret <4 x float> %shuf } instcombine currently combines away the shuffle. But if lane 0 is poison, that's propagating poison in a case where it wouldn't otherwise.
I believe a lot of nice vector combines would go away if poison was per-vector instead of per-element. This is something we could demonstrate using alive2 pretty easily, if anyone needs convincing.
Having some mask element = undef is equivalent to a mask that is OOB (since we can make the undef value take a large value). Right now Langref says that case gives undef, which is consistent with the optimization shown in this report. In Alive we implemented this OOB case as poison, since we really need to get rid of undef ASAP (to make GVN correct, for example). My additional worry with generating undef is that it's very easy to fall in the trap of "pick one of undef, %foo -> %foo", which is incorrect if %foo is poison. For example: define <2 x float> @shuffle(<2 x float> %arg) { %shuf = shufflevector <2 x float> %arg, <2 x float> undef, <2 x i32> <i32 0, i32 2> ret <2 x float> %shuf } instcombine transforms this to: define <2 x float> @shuffle(<2 x float> %arg) { ret <2 x float> %arg } https://godbolt.org/z/Ca6uGs This is incorrect in case %arg[1] is poison. (assuming poison is element-wise) It seems the optimization problems would go away if we were to initialize vector with poison values rather than undef values, like: define <4 x float> @splat_undef1(float %arg) { %t = insertelement <4 x float> poison, float %arg, i32 1 %t5 = insertelement <4 x float> %t, float %arg, i32 2 %t6 = insertelement <4 x float> %t5, float %arg, i32 3 ret <4 x float> %t6 } => define <4 x float> @less_poison(float %arg) { %t1 = insertelement <4 x float> poison, float %arg, i32 0 %t6 = shufflevector <4 x float> %t1, <4 x float> poison, <4 x i32> <i32 poison, i32 0, i32 0, i32 0> ret <4 x float> %t6 } This would be correct. We can change the semantics in Alive for the OOB case and make that undef for now, and revisit later when we introduce the poison value and commit to removing undef. The example above (@shuffle function) is incorrect nevertheless.
(In reply to Nuno Lopes from comment #4) > My additional worry with generating undef is that it's very easy to fall in > the trap of "pick one of undef, %foo -> %foo", which is incorrect if %foo is > poison. > > For example: > define <2 x float> @shuffle(<2 x float> %arg) { > %shuf = shufflevector <2 x float> %arg, <2 x float> undef, <2 x i32> <i32 > 0, i32 2> > ret <2 x float> %shuf > } > > instcombine transforms this to: > define <2 x float> @shuffle(<2 x float> %arg) { > ret <2 x float> %arg > } > > https://godbolt.org/z/Ca6uGs > > This is incorrect in case %arg[1] is poison. > (assuming poison is element-wise) This transform happens in 2 steps internally inside instcombine. The 1st step changes the shuffle mask element that is choosing the undef vector operand: %shuf = shufflevector <2 x float> %arg, <2 x float> undef, <2 x i32> <i32 0, i32 2> --> %shuf = shufflevector <2 x float> %arg, <2 x float> undef, <2 x i32> <i32 0, i32 undef> ---------------------------------------------------------------------------- The reasoning is: // If this shuffle is choosing an undef element from 1 of the sources, that // shuffle mask element is undef. SelectionDAG also has that kind of transform, so removing that in IR may not affect the end result that much. But we have poison in SDAG too, so we wouldn't completely fix this by changing instcombine alone. But IIUC, we're saying we can sweep this under the rug for now and wait for a poison value to be introduced as another (final?) step to removing undef. Once the shuffle mask has an undef element, we're back to the same example Eli noted in comment 2.
Ah, looking again at Eli's example: define <4 x float> @shuffle(<4 x float> %arg) { %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 undef, i32 1, i32 2, i32 3> ret <4 x float> %shuf } => ret %arg I agree that we shouldn't be doing this because it transforms undef into poison. This would be fine if we used poison to initialize vectors instead of undef. For the example above, here's an end-to-end miscompilation: define i1 @fn(i32 %x) { %x_1 = add nsw i32 %x, 1 %v0 = insertelement <2 x i32> undef, i32 %x_1, i32 0 %v2 = call <2 x i32> @shuffle(<2 x i32> %v0) %val = extractelement <2 x i32> %v2, i32 0 %cmp = icmp sgt i32 %val, %x ret i1 %cmp } => define i1 @fn(i32 %x) { ret i1 true } https://godbolt.org/z/kMEXH5 If %v2 was <undef, %x_1> as it should, then %cmp couldn't be true, since "undef > %x" isn't always true. Since %v2 is (incorrectly) <%x_1, %x_1>, we get "%x +nsw 1 > %x", which is always true. So replacing poison with undef here isn't correct.
(In reply to Nuno Lopes from comment #6) > Ah, looking again at Eli's example: > > define <4 x float> @shuffle(<4 x float> %arg) { > %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 > undef, i32 1, i32 2, i32 3> > ret <4 x float> %shuf > } > => > ret %arg > > I agree that we shouldn't be doing this because it transforms undef into > poison. > This would be fine if we used poison to initialize vectors instead of undef. Given a chance to punt on making intermediate/temporary changes, I'll take it. :) Or let me know if we should fix this right away to make progress/prevent miscompiles.
(In reply to Sanjay Patel from comment #7) > (In reply to Nuno Lopes from comment #6) > > Ah, looking again at Eli's example: > > > > define <4 x float> @shuffle(<4 x float> %arg) { > > %shuf = shufflevector <4 x float> %arg, <4 x float> undef, <4 x i32> <i32 > > undef, i32 1, i32 2, i32 3> > > ret <4 x float> %shuf > > } > > => > > ret %arg > > > > I agree that we shouldn't be doing this because it transforms undef into > > poison. > > This would be fine if we used poison to initialize vectors instead of undef. > > Given a chance to punt on making intermediate/temporary changes, I'll take > it. :) > Or let me know if we should fix this right away to make progress/prevent > miscompiles. Fair enough :) This one I would fix because it's possible to get end-to-end miscompilation. It also allows us to gets rid of a bunch of failures in LLVM unit tests when run with Alive2, which allow us to find other problems if they exist. I would ignore the other transformation we reported in this bug report. That one seems safe; we'll run more tests and report back if anything pops up. Thanks!
I think I cc'd everyone's corresponding Phab handle, but just in case: https://reviews.llvm.org/D70246
Code fix: https://reviews.llvm.org/rGf575f12c6465 Doc fix proposal: https://reviews.llvm.org/D70641
Anything left to do here? We clarified the reasoning about shuffle masks and documented it slightly better with: http://lists.llvm.org/pipermail/llvm-dev/2019-November/137243.html https://reviews.llvm.org/rGead0d77409b888ad5b8df90ee3b8ab7639b241c7
Ok, I agree we can close this one. The next issue with shufflevector+undef, which contradicts a bit the conclusion in that LLVMdev thread, is in https://bugs.llvm.org/show_bug.cgi?id=44185