Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InstCombine: incorrect fabs formation #59279

Closed
nunoplopes opened this issue Dec 1, 2022 · 19 comments
Closed

InstCombine: incorrect fabs formation #59279

nunoplopes opened this issue Dec 1, 2022 · 19 comments

Comments

@nunoplopes
Copy link
Member

Test Transforms/InstCombine/fabs.ll shows the following transformation:

define double @select_fcmp_ole_zero(double %x) {
  %lezero = fcmp ole double %x, 0.000000
  %negx = fsub double 0.000000, %x, exceptions=ignore
  %fabs = select i1 %lezero, double %negx, double %x
  ret double %fabs
}
=>
define double @select_fcmp_ole_zero(double %x) {
  %1 = fabs double %x, exceptions=ignore
  ret double %1
}

This is not correct as fabs produces a non-deterministic NaN payload, while the original function preserved the NaN payload when x > 0.

cc @arsenm @jcranmer-intel @jyknight @rotateright

@nunoplopes
Copy link
Member Author

Similar pattern in Transforms/InstCombine/fneg-fabs.ll:

define double @select_nsz_nfabs_ult(double %x) {
  %cmp = fcmp ult double %x, 0.000000
  %negX = fneg double %x
  %sel = select nsz i1 %cmp, double %x, double %negX
  ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
  %1 = fabs nsz double %x
  %sel = fneg nsz double %1
  ret double %sel
}

@rotateright
Copy link
Contributor

We don't state it explicitly in the LangRef, but fabs is purely a signbit operation (as is fneg).

So the LLVM 'fabs' conforms to the libm 'fabs' which implements the IEEE-754 abs() that says (in section 5.5.1 of the 2008 version):
"Implementations shall provide the following homogeneous quiet-computational sign bit operations for all supported arithmetic formats; they only affect the sign bit. The operations treat floating-point numbers and NaNs alike, and signal no exception. These operations may propagate non-canonical encodings."

So if we're transforming to fabs, the payload isn't changed?

@jcranmer-intel
Copy link
Contributor

Yes, fabs, fneg, copysign should all touch only the sign bit, preserving NaN payload.

@nunoplopes
Copy link
Member Author

I had it like that, but several optimizations were failing. I'll collect a few so you can decide the semantics you want :)

@nunoplopes
Copy link
Member Author

Ok, I can only see only real failure from InstCombine/fneg.ll:

define float @fnabs_2(float %a) {
  %fneg = fneg float %a
  %cmp = fcmp olt float %a, %fneg
  %sel = select nsz i1 %cmp, float %fneg, float %a
  %fneg1 = fneg float %sel
  ret float %fneg1
}
=>
define float @fnabs_2(float %a) {
  %1 = fabs nsz float %a
  %fneg1 = fneg float %1
  ret float %fneg1
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
float %a = #xff810000 (NaN)

Source:
float %fneg = #x7f810000 (NaN)
i1 %cmp = #x0 (0)
float %sel = #xff810000 (NaN)
float %fneg1 = #x7f810000 (NaN)

Target:
float %1 = #x7f810000 (NaN)
float %fneg1 = #xff810000 (NaN)
Source value: #x7f810000 (NaN)
Target value: #xff810000 (NaN)

Should we just restrict this to ops with nnan?

FWIW, there are two more cases, but they involve undef, so I think we can ignore them while we quietly work on removing undef from LLVM..

@arsenm
Copy link
Contributor

arsenm commented Dec 2, 2022

Ok, I can only see only real failure from InstCombine/fneg.ll:

Source value: #x7f810000 (NaN)
Target value: #xff810000 (NaN)

The sign bit of a NaN is meaningless, so this isn't wrong

@nunoplopes
Copy link
Member Author

Ok, I've changed fabs/fneg/copysign to return a non-deterministic sign when the input is NaN.
I'm seeing only 2 related failures now:

; InstCombine/fabs.ll
define double @select_fcmp_ole_zero(double %x) {
  %lezero = fcmp ole double %x, 0.000000
  %negx = fsub double 0.000000, %x
  %fabs = select i1 %lezero, double %negx, double %x
  ret double %fabs
}
=>
define double @select_fcmp_ole_zero(double %x) {
  %1 = fabs double %x
  ret double %1
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined

Example:
double %x = #xfff0004000000000 (NaN)
i1 %lezero = #x0 (0)

The issue is that the fcmp in source evaluates to false when the input is NaN. That makes the select return the original, unmodified %x.
However, the tgt program returns %x with a non-deterministic sign bit.
So either the transformation is changed to have some nnan attribute, for example. Or it needs to be removed. It's not correct under this semantics of a non-deterministic NaN sign bit (@RalfJung).

@nunoplopes
Copy link
Member Author

The other failure:

; InstCombine/fneg-fabs.ll
define double @select_nsz_nfabs_ult(double %x) {
  %cmp = fcmp ult double %x, 0.000000
  %negX = fneg double %x
  %sel = select nsz i1 %cmp, double %x, double %negX
  ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
  %1 = fabs nsz double %x
  %sel = fneg nsz double %1
  ret double %sel
}

@jyknight
Copy link
Member

jyknight commented Dec 5, 2022

The sign bit of a NaN is meaningless, so this isn't wrong

I've changed fabs/fneg/copysign to return a non-deterministic sign when the input is NaN.

No, that's wrong. Yes, the sign is "meaningless", and is effectively arbitrary when resulting from a floating-point operation. But, the sign-manipulation operations are considered bit-manipulation instructions, not as floating-point operations. They affect ONLY the sign bit of the values, and have deterministic outcome even for NaNs -- even for signaling NaN.

Therefore, the fnabs_2 example was indeed incorrect. All of the operations there -- except the compare -- only manipulate sign bits. And olt is fully defined for qNaNs. So, the outcome is entirely deterministic, no wiggle-room available. The transformed version similarly only has sign-manipulation, so no wiggle room there either.

@nunoplopes
Copy link
Member Author

Ok, so the sign bit of fabs should always be 0 irrespective of the input?

I can flip the semantics again in Alive2; I just want you guys to converge first. And maybe to fix some of the optimizations above as they can't all be correct. We (you) have to choose which ones to keep and which ones to restrict to fast-math.

@jyknight
Copy link
Member

jyknight commented Dec 5, 2022

Ok, so the sign bit of fabs should always be 0 irrespective of the input?

Yes. Or, well, at least...that is the behavior specified by IEEE754, (typically) implemented in FP hardware, and specified by WebAssembly.

It's not currently documented precisely in LangRef, but, I think those are the most sensible semantics for LLVM IR as well.

@RalfJung
Copy link
Contributor

RalfJung commented Dec 5, 2022

There's a short list of functions that are specified as behaving "bitwise" and they should probably be modeled as fully deterministic, not clobbering NaN payload and sign bits: copy, negate, abs, copySign. (Source 1, Source 2)

@jcranmer-intel
Copy link
Contributor

The IEEE 754 names for the operations are as WASM calls them; in C terms, the operations are (from F.3) memcpy (and friends) or +(x), -(x), fabs, and copysign. In LLVM, it's llvm.copysign, fneg, and llvm.fabs.

@nunoplopes
Copy link
Member Author

I've changed the semantics in Alive2 again so these operations are strictly bitwise, and there are only 2 failing tests: the fnabs_2 above, and another one (square fabs(x) -> square x) that is correct if undef didn't exist, so let's ignore for now.

rotateright added a commit that referenced this issue Dec 9, 2022
…ct fold

We were conservatively intersecting flags, but we can take the union here
because forbidden special values (nsz/nnan/ninf) are not altered by fneg.
So if they were guaranteed not present on the select or fneg, then they
are guaranteed not present on the new select. Alive2 appears to agree on
the test diffs (reduced to not include irrelevant flags like reassoc):
https://alive2.llvm.org/ce/z/ViqqrO

This prevents a potential regression if we tighten up the FMF behavior
for fabs with NAN as suggested in issue #59279.
@rotateright
Copy link
Contributor

A one-line patch to require 'nnan' on the fabs transform shows many test diffs:
fabs.nnan.patch.txt

If all of the tests besides "fnabs_2" are correct as-is, then there should be some more limited constraint that we can use. But these all seem similar to me. For example, is the 1st changed test in the patch "/llvm/test/Transforms/InstCombine/fabs.ll - @select_fcmp_ogt_fneg" really correct without nnan?

@nunoplopes
Copy link
Member Author

It's also wrong. Online Alive2 is a few days old, so it's not catching it up.

define float @select_fcmp_ogt_fneg(float %a) {
  %fneg = fneg float %a, exceptions=ignore
  %cmp = fcmp ogt float %a, %fneg
  %r = select nsz i1 %cmp, float %a, float %fneg
  ret float %r
}
=>
define float @select_fcmp_ogt_fneg(float %a) {
  %1 = fabs nsz float %a, exceptions=ignore
  ret float %1
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
float %a = #x7f810000 (NaN)

@rotateright
Copy link
Contributor

Ok - I wasn't imagining the diffs. :)
I'll stamp out copies of the changed tests that include 'nnan', so we have coverage for both variants. Hopefully, the extra 'nnan' constraint isn't meaningful in the real-world (I expect most users that want faster FP use that option).

rotateright added a commit that referenced this issue Dec 10, 2022
The existing variants have "nsz", but that's not enough
to get fabs/fneg semantics right with a NAN input, so
I duplicated those with "nnan" tacked on. See discussion
in issue #59279.
rotateright added a commit that referenced this issue Dec 10, 2022
There was a code comment about detecting min/max, and we were already
doing that later.

The real motivation is hinted at by the new TODO comment. I'm hoping
to untangle some FMF ambiguity in follow-on patches. See discussion
in issue #59279.

There are enough unknowns in FMF handling that I can't say with
certainty that this change is NFC, but it doesn't cause any existing
regression tests to change.
rotateright added a commit that referenced this issue Dec 11, 2022
This is intended to mitigate potential regressions that
would result from restricting this fold for NANs as
discussed in issue #59279.

Ideally, we could do this more generally because we have
known problems seeing/generating FMF on a select, but there
are likely many corner cases that need to verified.

For example, I thought this propagation would be valid
without looking at the condition value and for 'nsz' too,
but according to Alive2, it is not:
https://alive2.llvm.org/ce/z/AnG6As
@rotateright
Copy link
Contributor

rotateright added a commit that referenced this issue Dec 28, 2022
As discussed in issue #59279, we want fneg/fabs to conform to the
IEEE-754 spec for signbit operations - quoting from section 5.5.1
of IEEE-754-2008:
"negate(x) copies a floating-point operand x to a destination in
the same format, reversing the sign bit"
"abs(x) copies a floating-point operand x to a destination in the
same format, setting the sign bit to 0 (positive)"
"The operations treat floating-point numbers and NaNs alike."

So we gate this transform with "nnan" in addition to "nsz":
(X > 0.0) ? X : -X --> fabs(X)

Without that restriction, we could have for example:
(+NaN > 0.0) ? +NaN : -NaN --> -NaN
(because an ordered compare with NaN is always false)
That would be different than fabs(+NaN) --> +NaN.

More fabs/fneg patterns demonstrated here:
https://godbolt.org/z/h8ecc659d
(without any FMF, these are correct independently of this patch -
no fabs should be created)

The code change is a one-liner, but we have lots of tests diffs
because there are many variations of the basic pattern.

Differential Revision: https://reviews.llvm.org/D139785
@rotateright
Copy link
Contributor

I think this problem is solved with:
862e35e

Please re-open or file new bugs if we are still not handling neg/abs as expected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants