New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InstCombine: incorrect fabs formation #59279
Comments
Similar pattern in define double @select_nsz_nfabs_ult(double %x) {
%cmp = fcmp ult double %x, 0.000000
%negX = fneg double %x
%sel = select nsz i1 %cmp, double %x, double %negX
ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
%1 = fabs nsz double %x
%sel = fneg nsz double %1
ret double %sel
} |
We don't state it explicitly in the LangRef, but fabs is purely a signbit operation (as is fneg). So the LLVM 'fabs' conforms to the libm 'fabs' which implements the IEEE-754 abs() that says (in section 5.5.1 of the 2008 version): So if we're transforming to fabs, the payload isn't changed? |
Yes, |
I had it like that, but several optimizations were failing. I'll collect a few so you can decide the semantics you want :) |
Ok, I can only see only real failure from define float @fnabs_2(float %a) {
%fneg = fneg float %a
%cmp = fcmp olt float %a, %fneg
%sel = select nsz i1 %cmp, float %fneg, float %a
%fneg1 = fneg float %sel
ret float %fneg1
}
=>
define float @fnabs_2(float %a) {
%1 = fabs nsz float %a
%fneg1 = fneg float %1
ret float %fneg1
}
Transformation doesn't verify!
ERROR: Value mismatch
Example:
float %a = #xff810000 (NaN)
Source:
float %fneg = #x7f810000 (NaN)
i1 %cmp = #x0 (0)
float %sel = #xff810000 (NaN)
float %fneg1 = #x7f810000 (NaN)
Target:
float %1 = #x7f810000 (NaN)
float %fneg1 = #xff810000 (NaN)
Source value: #x7f810000 (NaN)
Target value: #xff810000 (NaN) Should we just restrict this to ops with FWIW, there are two more cases, but they involve undef, so I think we can ignore them while we quietly work on removing undef from LLVM.. |
The sign bit of a NaN is meaningless, so this isn't wrong |
Ok, I've changed fabs/fneg/copysign to return a non-deterministic sign when the input is NaN. ; InstCombine/fabs.ll
define double @select_fcmp_ole_zero(double %x) {
%lezero = fcmp ole double %x, 0.000000
%negx = fsub double 0.000000, %x
%fabs = select i1 %lezero, double %negx, double %x
ret double %fabs
}
=>
define double @select_fcmp_ole_zero(double %x) {
%1 = fabs double %x
ret double %1
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined
Example:
double %x = #xfff0004000000000 (NaN)
i1 %lezero = #x0 (0) The issue is that the fcmp in source evaluates to false when the input is NaN. That makes the select return the original, unmodified %x. |
The other failure: ; InstCombine/fneg-fabs.ll
define double @select_nsz_nfabs_ult(double %x) {
%cmp = fcmp ult double %x, 0.000000
%negX = fneg double %x
%sel = select nsz i1 %cmp, double %x, double %negX
ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
%1 = fabs nsz double %x
%sel = fneg nsz double %1
ret double %sel
} |
No, that's wrong. Yes, the sign is "meaningless", and is effectively arbitrary when resulting from a floating-point operation. But, the sign-manipulation operations are considered bit-manipulation instructions, not as floating-point operations. They affect ONLY the sign bit of the values, and have deterministic outcome even for NaNs -- even for signaling NaN. Therefore, the fnabs_2 example was indeed incorrect. All of the operations there -- except the compare -- only manipulate sign bits. And olt is fully defined for qNaNs. So, the outcome is entirely deterministic, no wiggle-room available. The transformed version similarly only has sign-manipulation, so no wiggle room there either. |
Ok, so the sign bit of fabs should always be 0 irrespective of the input? I can flip the semantics again in Alive2; I just want you guys to converge first. And maybe to fix some of the optimizations above as they can't all be correct. We (you) have to choose which ones to keep and which ones to restrict to fast-math. |
Yes. Or, well, at least...that is the behavior specified by IEEE754, (typically) implemented in FP hardware, and specified by WebAssembly. It's not currently documented precisely in LangRef, but, I think those are the most sensible semantics for LLVM IR as well. |
The IEEE 754 names for the operations are as WASM calls them; in C terms, the operations are (from F.3) |
I've changed the semantics in Alive2 again so these operations are strictly bitwise, and there are only 2 failing tests: the fnabs_2 above, and another one (square fabs(x) -> square x) that is correct if undef didn't exist, so let's ignore for now. |
…ct fold We were conservatively intersecting flags, but we can take the union here because forbidden special values (nsz/nnan/ninf) are not altered by fneg. So if they were guaranteed not present on the select or fneg, then they are guaranteed not present on the new select. Alive2 appears to agree on the test diffs (reduced to not include irrelevant flags like reassoc): https://alive2.llvm.org/ce/z/ViqqrO This prevents a potential regression if we tighten up the FMF behavior for fabs with NAN as suggested in issue #59279.
A one-line patch to require 'nnan' on the fabs transform shows many test diffs: If all of the tests besides "fnabs_2" are correct as-is, then there should be some more limited constraint that we can use. But these all seem similar to me. For example, is the 1st changed test in the patch "/llvm/test/Transforms/InstCombine/fabs.ll - @select_fcmp_ogt_fneg" really correct without nnan? |
It's also wrong. Online Alive2 is a few days old, so it's not catching it up. define float @select_fcmp_ogt_fneg(float %a) {
%fneg = fneg float %a, exceptions=ignore
%cmp = fcmp ogt float %a, %fneg
%r = select nsz i1 %cmp, float %a, float %fneg
ret float %r
}
=>
define float @select_fcmp_ogt_fneg(float %a) {
%1 = fabs nsz float %a, exceptions=ignore
ret float %1
}
Transformation doesn't verify!
ERROR: Value mismatch
Example:
float %a = #x7f810000 (NaN) |
Ok - I wasn't imagining the diffs. :) |
The existing variants have "nsz", but that's not enough to get fabs/fneg semantics right with a NAN input, so I duplicated those with "nnan" tacked on. See discussion in issue #59279.
There was a code comment about detecting min/max, and we were already doing that later. The real motivation is hinted at by the new TODO comment. I'm hoping to untangle some FMF ambiguity in follow-on patches. See discussion in issue #59279. There are enough unknowns in FMF handling that I can't say with certainty that this change is NFC, but it doesn't cause any existing regression tests to change.
This is intended to mitigate potential regressions that would result from restricting this fold for NANs as discussed in issue #59279. Ideally, we could do this more generally because we have known problems seeing/generating FMF on a select, but there are likely many corner cases that need to verified. For example, I thought this propagation would be valid without looking at the condition value and for 'nsz' too, but according to Alive2, it is not: https://alive2.llvm.org/ce/z/AnG6As
As discussed in issue #59279, we want fneg/fabs to conform to the IEEE-754 spec for signbit operations - quoting from section 5.5.1 of IEEE-754-2008: "negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit" "abs(x) copies a floating-point operand x to a destination in the same format, setting the sign bit to 0 (positive)" "The operations treat floating-point numbers and NaNs alike." So we gate this transform with "nnan" in addition to "nsz": (X > 0.0) ? X : -X --> fabs(X) Without that restriction, we could have for example: (+NaN > 0.0) ? +NaN : -NaN --> -NaN (because an ordered compare with NaN is always false) That would be different than fabs(+NaN) --> +NaN. More fabs/fneg patterns demonstrated here: https://godbolt.org/z/h8ecc659d (without any FMF, these are correct independently of this patch - no fabs should be created) The code change is a one-liner, but we have lots of tests diffs because there are many variations of the basic pattern. Differential Revision: https://reviews.llvm.org/D139785
I think this problem is solved with: Please re-open or file new bugs if we are still not handling neg/abs as expected. |
Test
Transforms/InstCombine/fabs.ll
shows the following transformation:This is not correct as
fabs
produces a non-deterministic NaN payload, while the original function preserved the NaN payload when x > 0.cc @arsenm @jcranmer-intel @jyknight @rotateright
The text was updated successfully, but these errors were encountered: