New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Failure to simplify -min(-a, a) => max(-a, a) #38828
Comments
That's a hard way to write 'std::abs': %negx = sub i32 0, %x |
-min(-x, x) can be generalized to -min(-x, y) => max(x, -y) (https://rise4fun.com/Alive/FBpl) but requires nsw at least on the min negation. The opposite case only works for y = x: -max(-x, x) => max(-x, x) (https://rise4fun.com/Alive/FDjL) but works for any combination of nsw flags. |
Not sure about the nsw/nuw flags, but according to Alive this is correct too: ; -min(-x, -y) => max(x, y) %negx = sub nsw 0, %x I've noticed this while looking at the codegen of functions in llvm/test/CodeGen/X86/vector-reduce-umax.ll Specifically, the sequence: ; X64-SSE2-NEXT: movdqa {{.*#+}} xmm2 = [32768,32768,32768,32768,32768,32768,32768,32768] Which is basically doing a And could be rewritten as a simple ; X64-SSE2-NEXT: pmaxsw %xmm0, %xmm1 |
Acutally sorry, the original code from that test was trying to do float negations (doing a xor of the sign bit). Anyway, this is the Alive link: https://rise4fun.com/Alive/5Y7 |
Note that -min(-x, y) => max(x, -y) is more general than -min(-x, -y) => max(x, y) it just requires an extra -(-y) => y step. |
Nowadays with the help of Negator we get: define dso_local i32 @_Z11min_neg_s32i(i32 %0) local_unnamed_addr #0 { So i think the transform we are missing is Name: (-x) s> x --> x s< 0 Name: (-x) s>= x --> x s< 1
https://rise4fun.com/Alive/moSi Ignoring the question of intrinsics, i can take a look |
Added folds for |
... but don't handle the vector case, since it lacks NSW. |
Hm, i lost track here. Let me revisit this... |
There are multiple examples here, so I'm not sure what's left either. On the FP side, we probably just need to hoist an fneg through a select (fneg is a unary op, so it misses the binop transforms we would try with a select operand): We want to make sure that FMF get propagated through those transforms, so it needs a pile of tests. Let me know if I should work on that part. |
And once again, i'm not sure what specific transform we are missing?
|
Posted too soon. @Sanjay yes, i don't intend to touch FP side of things, feel free to. |
Ok, i'm seeing one missed Negator pattern: |
FP neg fold: I think that gives us the optimal code for the example in the original description: float min_neg_f32(float a) --> This isn't fabs because of -0.0, but we miss that too even with -ffast-math, so that's another bug... |
Filled in another missing FP min/max transform: |
AFAICT the integer min/max -> abs folds are now complete - @rotateright are there any more fabs patterns that you think should be handled on this ticket or can we close it? |
Let's close this one since it was mostly about the integer patterns. I have some notes about the missing FP transforms. I'll open new issues for those or just post patches. |
This can be viewed as swapping the select arms: https://alive2.llvm.org/ce/z/jUvFMJ ...so we don't have the 'nsz' problem with the more general fold. This unlocks other folds for the motivating fabs example. This was discussed in issue #38828.
This inverts a fold recently added to IR with: 3491f2f We can put -bidirectional on the Alive2 examples to show that the reverse transforms work: https://alive2.llvm.org/ce/z/8iVQwB The motivation for the IR change was to improve matching to 'fabs' in IR (see #38828 ), but it regressed x86 codegen for 'not-quite-fabs' patterns like (X > -X) ? X : -X. Ie, when there is no fast-math (nsz), the cmp+select is not a proper fabs operation, but it does map nicely to the unusual NAN semantics of MINSS/MAXSS. I drafted this as a target-independent fold, but it doesn't appear to help any other targets and seems to cause regressions for SystemZ at least. Differential Revision: https://reviews.llvm.org/D122726
This inverts a fold recently added to IR with: 3491f2f4b033 We can put -bidirectional on the Alive2 examples to show that the reverse transforms work: https://alive2.llvm.org/ce/z/8iVQwB The motivation for the IR change was to improve matching to 'fabs' in IR (see llvm/llvm-project#38828 ), but it regressed x86 codegen for 'not-quite-fabs' patterns like (X > -X) ? X : -X. Ie, when there is no fast-math (nsz), the cmp+select is not a proper fabs operation, but it does map nicely to the unusual NAN semantics of MINSS/MAXSS. I drafted this as a target-independent fold, but it doesn't appear to help any other targets and seems to cause regressions for SystemZ at least. Differential Revision: https://reviews.llvm.org/D122726
Extended Description
https://godbolt.org/z/h4zTcS
clang:
gcc manages the scalar cases (s32 is interesting....) but not the vector case:
The text was updated successfully, but these errors were encountered: