[X86] Failure to pull out common scaled address offset through select/cmov #50413
Comments
So I think we can perform this in InstCombine:
-->
So I think we're trying to fold: select(cond, gep(gep(ptr, idx0), idx1), gep(ptr, idx0))
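As a rough illustration of the fold named above, the sketch below models each gep as plain C pointer arithmetic and checks that selecting between the two pointers gives the same result as selecting the index (idx1 or 0) and applying it once. The function names and the test array are hypothetical, and the folded shape gep(gep(ptr, idx0), select(cond, idx1, 0)) is an assumption about the intended result, not something taken from either patch. It also ignores inbounds and poison subtleties.

```c
#include <stddef.h>
#include <stdint.h>

/* Original shape: select(cond, gep(gep(ptr, idx0), idx1), gep(ptr, idx0)). */
static const int32_t *select_of_geps(const int32_t *ptr, size_t idx0,
                                     size_t idx1, int cond) {
    const int32_t *inner = ptr + idx0;   /* gep(ptr, idx0) */
    const int32_t *outer = inner + idx1; /* gep(gep(ptr, idx0), idx1) */
    return cond ? outer : inner;
}

/* Assumed folded shape: gep(gep(ptr, idx0), select(cond, idx1, 0)). */
static const int32_t *gep_of_select(const int32_t *ptr, size_t idx0,
                                    size_t idx1, int cond) {
    const int32_t *inner = ptr + idx0;   /* common gep, computed once */
    size_t idx = cond ? idx1 : 0;        /* select on the index instead */
    return inner + idx;
}

/* Returns 1 if both shapes yield the same pointer for every tested input. */
int folds_agree(void) {
    static const int32_t data[16] = {0}; /* hypothetical backing array */
    for (size_t idx0 = 0; idx0 < 8; ++idx0)
        for (size_t idx1 = 0; idx1 < 8; ++idx1)
            for (int cond = 0; cond <= 1; ++cond)
                if (select_of_geps(data, idx0, idx1, cond) !=
                    gep_of_select(data, idx0, idx1, cond))
                    return 0;
    return 1;
}
```

The point of the second shape is that the common gep(ptr, idx0) is computed once, and only a cheap index is selected.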
This is just a special case of [Bug #50183] / D105901
Candidate Patch: https://reviews.llvm.org/D106352
Alternative Patch: https://reviews.llvm.org/D106450
Current Codegen:
We can probably improve this further with: %cond.idx = select i1 %tobool.not, i64 6, i64 0 --> %cond.idx9 = add i64 %cond.idx, %offset
I'm guessing you meant
?
That's correct in general, but for x86 with cmov, there's an opportunity to reduce constant materialization (since cmov needs register operands). See if this looks right:
Yes - sorry, hot weather (for the UK at least..) has melted my brain and destroyed my copy+paste skills :-(
Not sure if this example suggests a different approach, but we added it with https://reviews.llvm.org/D106684 :
And the suggested improvement to decompose the complex LEA:
We have three x86 codegen folds so far based on this:
And the example here is based on a sequence in a Bullet benchmark where GCC was doing better:
Do we know if we made a dent in that deficit? I'm not sure how to get that benchmark built and running locally.
There are some instructions on how to build it here: https://openbenchmarking.org/innhold/99d3a8c1ea3ea71e1edf4aea6bf9af30100f07d5
Extended Description
https://simd.godbolt.org/z/qsKWW1heG
We can reduce the number of complex LEA ops by pulling the common "%rsi,4" scaled offset through the cmov and into the addl address:
This might be generally achievable by canonicalizing the gep-chain, but for now I'm making this an X86 ticket.
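The address rewrite described above can be modelled with integer arithmetic. In this minimal sketch, the "before" form computes base + idx*4 in both arms of the select (two complex address computations), while the "after" form selects between two simple bases and applies the scaled index once, as a load or add with a memory operand could. The 24-byte displacement and all function names are assumptions chosen for illustration; they are not taken from the ticket's codegen.

```c
#include <stdint.h>

/* Before: each arm of the select carries the scaled offset idx*4. */
static uintptr_t addr_before(uintptr_t base, uintptr_t idx, int cond) {
    uintptr_t plain   = base + idx * 4;      /* like lea (base,idx,4) */
    uintptr_t shifted = base + idx * 4 + 24; /* like lea 24(base,idx,4); 24 is assumed */
    return cond ? shifted : plain;           /* cmov between two complex results */
}

/* After: select a simple base first, then apply the scaled offset once. */
static uintptr_t addr_after(uintptr_t base, uintptr_t idx, int cond) {
    uintptr_t selected = cond ? base + 24 : base; /* cmov between simple values */
    return selected + idx * 4;                    /* scaling folded into the final address */
}

/* Returns 1 if both forms compute the same address for every tested input. */
int addresses_agree(void) {
    for (uintptr_t base = 0; base < 4096; base += 64)
        for (uintptr_t idx = 0; idx < 32; ++idx)
            for (int cond = 0; cond <= 1; ++cond)
                if (addr_before(base, idx, cond) != addr_after(base, idx, cond))
                    return 0;
    return 1;
}
```

Because unsigned addition is associative and commutative, the two forms always agree; the difference is only in how many scaled address computations are needed before the select.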