LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 44185 - Can't remove shufflevector if input might be poison
Summary: Can't remove shufflevector if input might be poison
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: All All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords: miscompilation
Depends on:
Blocks: 47948
  Show dependency tree
 
Reported: 2019-11-30 07:20 PST by Nuno Lopes
Modified: 2021-09-27 03:24 PDT (History)
11 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nuno Lopes 2019-11-30 07:20:30 PST
Test case from Transforms/InstCombine/insert-extract-shuffle.ll:

define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %xv = insertelement <4 x float> %q, float %x, i32 2
  %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef }
  ret <4 x float> %r
}
=>
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) {
  %r = insertelement <4 x float> %y, float %x, i32 1
  ret <4 x float> %r
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
float %x = poison
<4 x float> %y = < poison, poison, poison, poison >
<4 x float> %q = < poison, poison, poison, poison >

Source:
<4 x float> %xv = < poison, poison, poison, poison >
<4 x float> %r = < poison, poison, poison, undef >

Target:
<4 x float> %r = < poison, poison, poison, poison >
Source value: < poison, poison, poison, undef >
Target value: < poison, poison, poison, poison >


This is with the semantics of LangRef, where an undef mask stops propagation of poison. This implies this kind of shufflevectors can't be removed.
Comment 1 Nuno Lopes 2019-11-30 07:22:08 PST
Similar optimization in Transforms/InstCombine/shuffle_select.ll:

define <4 x i32> @add_undef_mask_elt(<4 x i32> %v) {
  %b = add <4 x i32> %v, { 11, 12, 13, 14 }
  %s = shufflevector <4 x i32> %b, <4 x i32> %v, <4 x i32> { 0, 5, undef, 7 }
  ret <4 x i32> %s
}
=>
define <4 x i32> @add_undef_mask_elt(<4 x i32> %v) {
  %s = add <4 x i32> %v, { 11, 0, undef, 0 }
  ret <4 x i32> %s
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source

Example:
<4 x i32> %v = < poison, poison, poison, poison >

Source:
<4 x i32> %b = < poison, poison, poison, poison >
<4 x i32> %s = < poison, poison, undef, poison >

Target:
<4 x i32> %s = < poison, poison, poison, poison >
Source value: < poison, poison, undef, poison >
Target value: < poison, poison, poison, poison >
Comment 2 Hal Finkel 2019-12-01 06:36:51 PST
> This is with the semantics of LangRef, where an undef mask stops propagation of poison.

This seems unfortunate. Can you please remind me why we choose these semantics? It seems to me that (add undef, poison) should be poison, not undef, no?
Comment 3 Nuno Lopes 2019-12-02 01:59:00 PST
(In reply to Hal Finkel from comment #2)
> > This is with the semantics of LangRef, where an undef mask stops propagation of poison.
> 
> This seems unfortunate. Can you please remind me why we choose these
> semantics? It seems to me that (add undef, poison) should be poison, not
> undef, no?

See the discussion here: https://bugs.llvm.org/show_bug.cgi?id=43958

Short version, this semantics is needed if we want to support things introduction of broadcast:

  %t = insertelement <4 x float> undef, float %arg, i32 1
  %t4 = insertelement <4 x float> %t, float %arg, i32 1
  %t5 = insertelement <4 x float> %t4, float %arg, i32 2
  %t6 = insertelement <4 x float> %t5, float %arg, i32 3
  ret <4 x float> %t6
=>
  %t = insertelement <4 x float> undef, float %arg, i32 0
  %t2 = shufflevector <4 x float> %t, <4 x float> undef, <4 x i32> <i32 undef, i32 0, i32 0, i32 0>
   ret <4 x float> %t2


IMHO, the correct fix is to introduce a poison value and use that to initialize vectors. Also, a poison mask would give poison, so no more of these problems of choosing "undef or poison -> undef". Using shufflevector to stop poison isn't ideal, since then we can't remove it.
BTW, I've shown in #43958 that these bugs are exploitable for end-to-end miscompilations.
Comment 4 Hal Finkel 2019-12-02 08:12:47 PST
(In reply to Nuno Lopes from comment #3)
> (In reply to Hal Finkel from comment #2)
> > > This is with the semantics of LangRef, where an undef mask stops propagation of poison.
> > 
> > This seems unfortunate. Can you please remind me why we choose these
> > semantics? It seems to me that (add undef, poison) should be poison, not
> > undef, no?
> 
> See the discussion here: https://bugs.llvm.org/show_bug.cgi?id=43958
> 
> Short version, this semantics is needed if we want to support things
> introduction of broadcast:
> 
>   %t = insertelement <4 x float> undef, float %arg, i32 1
>   %t4 = insertelement <4 x float> %t, float %arg, i32 1
>   %t5 = insertelement <4 x float> %t4, float %arg, i32 2
>   %t6 = insertelement <4 x float> %t5, float %arg, i32 3
>   ret <4 x float> %t6
> =>
>   %t = insertelement <4 x float> undef, float %arg, i32 0
>   %t2 = shufflevector <4 x float> %t, <4 x float> undef, <4 x i32> <i32
> undef, i32 0, i32 0, i32 0>
>    ret <4 x float> %t2
> 
> 
> IMHO, the correct fix is to introduce a poison value and use that to
> initialize vectors. Also, a poison mask would give poison, so no more of
> these problems of choosing "undef or poison -> undef".


That makes sense to me. I expect that in many cases where we use an 'undef' constant, a poison constant would be a better fit. It's the "I need to put a value here, but the program shouldn't ever actually depend on it" value, and at least informally, that's how undef has often been used (as opposed to "the program can depend on it, and it's arbitrary, but that won't matter to the user" undef semantics).

> Using shufflevector
> to stop poison isn't ideal, since then we can't remove it.
> BTW, I've shown in #43958 that these bugs are exploitable for end-to-end
> miscompilations.
Comment 5 Amara Emerson 2020-12-29 17:46:52 PST
> It's the "I need to put a value here, but the program shouldn't ever actually depend on it" value, and at least informally, that's how undef has often been used (as opposed to "the program can depend on it, and it's arbitrary, but that won't matter to the user" undef semantics).

This is a nice succinct explanation of the difference between the two, that seems to be missing from the langref's explanation of the constants.