LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 37358 - SIMD Misoptimization during "Promote 'by reference' arguments to scalars on SCC"
Summary: SIMD Misoptimization during "Promote 'by reference' arguments to scalars on SCC"
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86 (show other bugs)
Version: trunk
Hardware: PC Windows NT
: P enhancement
Assignee: Tom Stellard
URL:
Keywords:
Depends on:
Blocks: release-7.0.1
  Show dependency tree
 
Reported: 2018-05-07 13:41 PDT by Alex Crichton
Modified: 2019-01-16 13:21 PST (History)
17 users (show)

See Also:
Fixed By Commit(s): r351296


Attachments
failing IR (4.59 KB, text/plain)
2018-05-07 13:41 PDT, Alex Crichton
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alex Crichton 2018-05-07 13:41:35 PDT
Created attachment 20270 [details]
failing IR

We've got an upstream bug in rust-lang/rust at https://github.com/rust-lang/rust/issues/50154 where LLVM at opt-level=3 is mis-optimizing promotion of an argument passed by reference to pass-by-value. The attached IR exhibits the difference by looking at:


    $ opt -O2 tmp.ll -S | grep '^define.*m256'
    define internal fastcc void @_mm256_cmpgt_epi16(<4 x i64>* nocapture, <4 x i64>* nocapture readonly %a, <4 x i64>* nocapture readonly %b) unnamed_addr #2 {
    $ opt -O3 tmp.ll -S | grep '^define.*m256'
    define internal fastcc void @_mm256_cmpgt_epi16(<4 x i64>* nocapture, <4 x i64> %a.val, <4 x i64> %b.val) unnamed_addr #2 {



Note that at opt-level=2 the two arguments to this function continue to be passed by reference, but at opt-level=3 they're promoted to being passed by value. In this situation the target function, `_mm256_cmpgt_epi16`, has the "avx2" feature enabled. The caller, `baseline`, does not have any extra target features enabled (aka doesn't have "avx2" available). This means that if attempting to pass by value this'll be an ABI mismatch at codegen time, producing invalid results on optimized IR.

Using opt-bisect-limit I found that this happens during the "Promote 'by reference' arguments to scalars on SCC" pass. Are we correct in thinking that this optimization shouldn't happen? Or is this a valid optimization that we'll need to work around on rustc's end?
Comment 1 Eli Friedman 2018-05-07 14:48:33 PDT
It's pretty clearly an LLVM bug.  I'm not sure what part of LLVM, though; arguably, the bug is in the x86 backend and the way it chooses a calling convention based on target features.

-----

Related testcase; try with clang --target=i686-pc-freebsd":

__attribute((target("sse2"),noinline)) static float x(float z) { return z+3; }
float y(float z) { return x(z)+3; }

This fails because we're transforming the call to x from the C calling convention to the LLVM-internal "fastcc" calling convention, and the fastcc convention uses SSE registers for arguments depending on the target features.
Comment 2 Gonzalo BG 2018-08-06 01:50:24 PDT
Should this issue be moved to the x86 backend component then ?
Comment 3 Hans Wennborg 2018-08-16 03:08:43 PDT
Any X86 folks on this bug that can take a look? Sounds like this might be nice to get fixed for the release.
Comment 4 Jack Mott 2018-09-01 08:57:16 PDT
This bug appears to also be affect a Rust simd library of mine, preventing it from being used with recursive functions that inline into target_feature functions. Would love to set this fixed.
Comment 5 Chandler Carruth 2018-09-04 03:14:43 PDT
Argument promotion pass needs to consider whether the caller/callee pair attributes make the promotion in question valid. The inliner already has exactly this kind of checks in place for things like target attributes.

The downside is that this check in argument promotion is somewhat more subtle.

However, I don't know that this makes sense to try to fix in time for the release... This bug seems likely to have always existed with LLVM's target attributes. =[

Eric, Craig, and myself are probably the best people to talk about how to fix this long-term.
Comment 6 Hans Wennborg 2018-09-04 06:19:05 PDT
Thanks, sounds like there's not enough traction to wait on this for the release.
Comment 7 Eric Christopher 2018-10-19 13:49:55 PDT
(In reply to Jack Mott from comment #4)
> This bug appears to also be affect a Rust simd library of mine, preventing
> it from being used with recursive functions that inline into target_feature
> functions. Would love to set this fixed.

Inlining with different target feature functions shouldn't happen. There are checks to ensure that they're at least compatible.
Comment 8 Jack Mott 2018-10-19 13:59:28 PDT
(In reply to Eric Christopher from comment #7)
> (In reply to Jack Mott from comment #4)
> > This bug appears to also be affect a Rust simd library of mine, preventing
> > it from being used with recursive functions that inline into target_feature
> > functions. Would love to set this fixed.
> 
> Inlining with different target feature functions shouldn't happen. There are
> checks to ensure that they're at least compatible.


It is a generic function in Rust, which at Rust compile time becomes either an SSE2 or AVX2 function, and is then inlined into appropriate target_feature tagged functions.  But when the function is recursive this breaks down. Alex Crichton took at look at the IL and thought this was likely the cause.
Comment 9 Eric Christopher 2018-10-19 14:16:43 PDT
(In reply to Jack Mott from comment #8)
> (In reply to Eric Christopher from comment #7)
> > (In reply to Jack Mott from comment #4)
> > > This bug appears to also be affect a Rust simd library of mine, preventing
> > > it from being used with recursive functions that inline into target_feature
> > > functions. Would love to set this fixed.
> > 
> > Inlining with different target feature functions shouldn't happen. There are
> > checks to ensure that they're at least compatible.
> 
> 
> It is a generic function in Rust, which at Rust compile time becomes either
> an SSE2 or AVX2 function, and is then inlined into appropriate
> target_feature tagged functions.  But when the function is recursive this
> breaks down. Alex Crichton took at look at the IL and thought this was
> likely the cause.

If it's an inlining failure it's unrelated to this. You might want to open a different bug with a stripped down testcase there.
Comment 10 Eric Christopher 2018-10-19 14:32:23 PDT
(In reply to Chandler Carruth from comment #5)
> Argument promotion pass needs to consider whether the caller/callee pair
> attributes make the promotion in question valid. The inliner already has
> exactly this kind of checks in place for things like target attributes.
> 
> The downside is that this check in argument promotion is somewhat more
> subtle.
> 
> However, I don't know that this makes sense to try to fix in time for the
> release... This bug seems likely to have always existed with LLVM's target
> attributes. =[
> 
> Eric, Craig, and myself are probably the best people to talk about how to
> fix this long-term.

In general fixing this is going to require argument promotion to know ABI requirements from the back end in the presence of target attributes. This is probably going to be a little tricky - though I'm surprised that this hasn't caused problems before now.

Right now the easiest mechanism is going to be a bit heavyweight: disable arg promotion in the face of conflicting target attributes even more extreme than the inlining code in that we don't want it to happen at all. Then each target can add ABI knowledge into how/whether we want arguments to be promoted after that.
Comment 11 Nikita Popov 2018-10-21 01:48:01 PDT
For the record, rustc now works around this problem by manually undoing argument promotion as a custom pass, see https://github.com/rust-lang/rust/pull/55073.
Comment 12 Tom Stellard 2018-10-22 11:24:58 PDT
Is anyone working on a fix for this?  If not, I will give it a try.
Comment 13 Eric Christopher 2018-10-22 14:07:24 PDT
I'm not, the advice I gave is probably how I'd go. I'm happy to look if you end up implementing it.
Comment 14 Tom Stellard 2018-10-22 21:40:10 PDT
Proposed Fix: https://reviews.llvm.org/D53554
Comment 15 Ben Striegel 2018-10-24 13:08:50 PDT
For those following along with the rustc side of things, Nikita's prior comment is now inaccurate as #55073 has been reverted: https://github.com/rust-lang/rust/pull/55281 . The sentiment now seems to be to just wait for an LLVM fix, and it's heartening to see that a patch is already being considered.
Comment 16 Tom Stellard 2019-01-16 12:22:52 PST
A fix has been merged in r351296, can you verify that this is fixed?
Comment 17 Alex Crichton 2019-01-16 13:21:23 PST
Thanks so much for landing the fix! We're updating rust-lang/rust in https://github.com/rust-lang/rust/pull/57675 and I'll start running some tests with that once it's in-tree. In the meantime I'll go head and close this as resolved, and I'll follow-up with more issues if they crop up, but I suspect we should be good to go!