InstCombine: incorrect fabs formation #59279

nunoplopes · 2022-12-01T09:03:59Z

Test Transforms/InstCombine/fabs.ll shows the following transformation:

define double @select_fcmp_ole_zero(double %x) {
  %lezero = fcmp ole double %x, 0.000000
  %negx = fsub double 0.000000, %x, exceptions=ignore
  %fabs = select i1 %lezero, double %negx, double %x
  ret double %fabs
}
=>
define double @select_fcmp_ole_zero(double %x) {
  %1 = fabs double %x, exceptions=ignore
  ret double %1
}

This is not correct as fabs produces a non-deterministic NaN payload, while the original function preserved the NaN payload when x > 0.

cc @arsenm @jcranmer-intel @jyknight @rotateright

The text was updated successfully, but these errors were encountered:

nunoplopes · 2022-12-01T09:04:55Z

Similar pattern in Transforms/InstCombine/fneg-fabs.ll:

define double @select_nsz_nfabs_ult(double %x) {
  %cmp = fcmp ult double %x, 0.000000
  %negX = fneg double %x
  %sel = select nsz i1 %cmp, double %x, double %negX
  ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
  %1 = fabs nsz double %x
  %sel = fneg nsz double %1
  ret double %sel
}

rotateright · 2022-12-01T16:40:45Z

We don't state it explicitly in the LangRef, but fabs is purely a signbit operation (as is fneg).

So the LLVM 'fabs' conforms to the libm 'fabs' which implements the IEEE-754 abs() that says (in section 5.5.1 of the 2008 version):
"Implementations shall provide the following homogeneous quiet-computational sign bit operations for all supported arithmetic formats; they only affect the sign bit. The operations treat floating-point numbers and NaNs alike, and signal no exception. These operations may propagate non-canonical encodings."

So if we're transforming to fabs, the payload isn't changed?

jcranmer-intel · 2022-12-01T16:56:22Z

Yes, fabs, fneg, copysign should all touch only the sign bit, preserving NaN payload.

nunoplopes · 2022-12-01T17:01:07Z

I had it like that, but several optimizations were failing. I'll collect a few so you can decide the semantics you want :)

nunoplopes · 2022-12-02T14:52:41Z

Ok, I can only see only real failure from InstCombine/fneg.ll:

define float @fnabs_2(float %a) {
  %fneg = fneg float %a
  %cmp = fcmp olt float %a, %fneg
  %sel = select nsz i1 %cmp, float %fneg, float %a
  %fneg1 = fneg float %sel
  ret float %fneg1
}
=>
define float @fnabs_2(float %a) {
  %1 = fabs nsz float %a
  %fneg1 = fneg float %1
  ret float %fneg1
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
float %a = #xff810000 (NaN)

Source:
float %fneg = #x7f810000 (NaN)
i1 %cmp = #x0 (0)
float %sel = #xff810000 (NaN)
float %fneg1 = #x7f810000 (NaN)

Target:
float %1 = #x7f810000 (NaN)
float %fneg1 = #xff810000 (NaN)
Source value: #x7f810000 (NaN)
Target value: #xff810000 (NaN)

Should we just restrict this to ops with nnan?

FWIW, there are two more cases, but they involve undef, so I think we can ignore them while we quietly work on removing undef from LLVM..

arsenm · 2022-12-02T19:43:21Z

Ok, I can only see only real failure from InstCombine/fneg.ll:

Source value: #x7f810000 (NaN)
Target value: #xff810000 (NaN)

The sign bit of a NaN is meaningless, so this isn't wrong

nunoplopes · 2022-12-04T10:41:14Z

Ok, I've changed fabs/fneg/copysign to return a non-deterministic sign when the input is NaN.
I'm seeing only 2 related failures now:

; InstCombine/fabs.ll
define double @select_fcmp_ole_zero(double %x) {
  %lezero = fcmp ole double %x, 0.000000
  %negx = fsub double 0.000000, %x
  %fabs = select i1 %lezero, double %negx, double %x
  ret double %fabs
}
=>
define double @select_fcmp_ole_zero(double %x) {
  %1 = fabs double %x
  ret double %1
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined

Example:
double %x = #xfff0004000000000 (NaN)
i1 %lezero = #x0 (0)

The issue is that the fcmp in source evaluates to false when the input is NaN. That makes the select return the original, unmodified %x.
However, the tgt program returns %x with a non-deterministic sign bit.
So either the transformation is changed to have some nnan attribute, for example. Or it needs to be removed. It's not correct under this semantics of a non-deterministic NaN sign bit (@RalfJung).

nunoplopes · 2022-12-04T10:42:11Z

The other failure:

; InstCombine/fneg-fabs.ll
define double @select_nsz_nfabs_ult(double %x) {
  %cmp = fcmp ult double %x, 0.000000
  %negX = fneg double %x
  %sel = select nsz i1 %cmp, double %x, double %negX
  ret double %sel
}
=>
define double @select_nsz_nfabs_ult(double %x) {
  %1 = fabs nsz double %x
  %sel = fneg nsz double %1
  ret double %sel
}

jyknight · 2022-12-05T15:07:08Z

The sign bit of a NaN is meaningless, so this isn't wrong

I've changed fabs/fneg/copysign to return a non-deterministic sign when the input is NaN.

No, that's wrong. Yes, the sign is "meaningless", and is effectively arbitrary when resulting from a floating-point operation. But, the sign-manipulation operations are considered bit-manipulation instructions, not as floating-point operations. They affect ONLY the sign bit of the values, and have deterministic outcome even for NaNs -- even for signaling NaN.

Therefore, the fnabs_2 example was indeed incorrect. All of the operations there -- except the compare -- only manipulate sign bits. And olt is fully defined for qNaNs. So, the outcome is entirely deterministic, no wiggle-room available. The transformed version similarly only has sign-manipulation, so no wiggle room there either.

nunoplopes · 2022-12-05T15:40:01Z

Ok, so the sign bit of fabs should always be 0 irrespective of the input?

I can flip the semantics again in Alive2; I just want you guys to converge first. And maybe to fix some of the optimizations above as they can't all be correct. We (you) have to choose which ones to keep and which ones to restrict to fast-math.

jyknight · 2022-12-05T16:22:55Z

Ok, so the sign bit of fabs should always be 0 irrespective of the input?

Yes. Or, well, at least...that is the behavior specified by IEEE754, (typically) implemented in FP hardware, and specified by WebAssembly.

It's not currently documented precisely in LangRef, but, I think those are the most sensible semantics for LLVM IR as well.

RalfJung · 2022-12-05T16:56:21Z

There's a short list of functions that are specified as behaving "bitwise" and they should probably be modeled as fully deterministic, not clobbering NaN payload and sign bits: copy, negate, abs, copySign. (Source 1, Source 2)

jcranmer-intel · 2022-12-05T17:29:46Z

The IEEE 754 names for the operations are as WASM calls them; in C terms, the operations are (from F.3) memcpy (and friends) or +(x), -(x), fabs, and copysign. In LLVM, it's llvm.copysign, fneg, and llvm.fabs.

nunoplopes · 2022-12-06T08:30:03Z

I've changed the semantics in Alive2 again so these operations are strictly bitwise, and there are only 2 failing tests: the fnabs_2 above, and another one (square fabs(x) -> square x) that is correct if undef didn't exist, so let's ignore for now.

…ct fold We were conservatively intersecting flags, but we can take the union here because forbidden special values (nsz/nnan/ninf) are not altered by fneg. So if they were guaranteed not present on the select or fneg, then they are guaranteed not present on the new select. Alive2 appears to agree on the test diffs (reduced to not include irrelevant flags like reassoc): https://alive2.llvm.org/ce/z/ViqqrO This prevents a potential regression if we tighten up the FMF behavior for fabs with NAN as suggested in issue #59279.

rotateright · 2022-12-09T17:51:30Z

A one-line patch to require 'nnan' on the fabs transform shows many test diffs:
fabs.nnan.patch.txt

If all of the tests besides "fnabs_2" are correct as-is, then there should be some more limited constraint that we can use. But these all seem similar to me. For example, is the 1st changed test in the patch "/llvm/test/Transforms/InstCombine/fabs.ll - @select_fcmp_ogt_fneg" really correct without nnan?

nunoplopes · 2022-12-09T18:01:27Z

It's also wrong. Online Alive2 is a few days old, so it's not catching it up.

define float @select_fcmp_ogt_fneg(float %a) {
  %fneg = fneg float %a, exceptions=ignore
  %cmp = fcmp ogt float %a, %fneg
  %r = select nsz i1 %cmp, float %a, float %fneg
  ret float %r
}
=>
define float @select_fcmp_ogt_fneg(float %a) {
  %1 = fabs nsz float %a, exceptions=ignore
  ret float %1
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
float %a = #x7f810000 (NaN)

rotateright · 2022-12-09T18:12:52Z

Ok - I wasn't imagining the diffs. :)
I'll stamp out copies of the changed tests that include 'nnan', so we have coverage for both variants. Hopefully, the extra 'nnan' constraint isn't meaningful in the real-world (I expect most users that want faster FP use that option).

The existing variants have "nsz", but that's not enough to get fabs/fneg semantics right with a NAN input, so I duplicated those with "nnan" tacked on. See discussion in issue #59279.

There was a code comment about detecting min/max, and we were already doing that later. The real motivation is hinted at by the new TODO comment. I'm hoping to untangle some FMF ambiguity in follow-on patches. See discussion in issue #59279. There are enough unknowns in FMF handling that I can't say with certainty that this change is NFC, but it doesn't cause any existing regression tests to change.

This is intended to mitigate potential regressions that would result from restricting this fold for NANs as discussed in issue #59279. Ideally, we could do this more generally because we have known problems seeing/generating FMF on a select, but there are likely many corner cases that need to verified. For example, I thought this propagation would be valid without looking at the condition value and for 'nsz' too, but according to Alive2, it is not: https://alive2.llvm.org/ce/z/AnG6As

rotateright · 2022-12-11T15:13:50Z

https://reviews.llvm.org/D139785

As discussed in issue #59279, we want fneg/fabs to conform to the IEEE-754 spec for signbit operations - quoting from section 5.5.1 of IEEE-754-2008: "negate(x) copies a floating-point operand x to a destination in the same format, reversing the sign bit" "abs(x) copies a floating-point operand x to a destination in the same format, setting the sign bit to 0 (positive)" "The operations treat floating-point numbers and NaNs alike." So we gate this transform with "nnan" in addition to "nsz": (X > 0.0) ? X : -X --> fabs(X) Without that restriction, we could have for example: (+NaN > 0.0) ? +NaN : -NaN --> -NaN (because an ordered compare with NaN is always false) That would be different than fabs(+NaN) --> +NaN. More fabs/fneg patterns demonstrated here: https://godbolt.org/z/h8ecc659d (without any FMF, these are correct independently of this patch - no fabs should be created) The code change is a one-liner, but we have lots of tests diffs because there are many variations of the basic pattern. Differential Revision: https://reviews.llvm.org/D139785

rotateright · 2022-12-28T19:34:01Z

I think this problem is solved with:
862e35e

Please re-open or file new bugs if we are still not handling neg/abs as expected.

nunoplopes added miscompilation llvm:instcombine labels Dec 1, 2022

nunoplopes mentioned this issue Dec 5, 2022

InstCombine: incorrect copysign formation #59280

Closed

rotateright mentioned this issue Dec 14, 2022

Bad codegen after 8adfa29706e (NaN constant folding weirdness) #59122

Closed

rotateright closed this as completed Dec 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

InstCombine: incorrect fabs formation #59279

InstCombine: incorrect fabs formation #59279

nunoplopes commented Dec 1, 2022

nunoplopes commented Dec 1, 2022

rotateright commented Dec 1, 2022

jcranmer-intel commented Dec 1, 2022

nunoplopes commented Dec 1, 2022

nunoplopes commented Dec 2, 2022

arsenm commented Dec 2, 2022

nunoplopes commented Dec 4, 2022

nunoplopes commented Dec 4, 2022

jyknight commented Dec 5, 2022 •

edited

nunoplopes commented Dec 5, 2022

jyknight commented Dec 5, 2022

RalfJung commented Dec 5, 2022 •

edited

jcranmer-intel commented Dec 5, 2022

nunoplopes commented Dec 6, 2022

rotateright commented Dec 9, 2022

nunoplopes commented Dec 9, 2022

rotateright commented Dec 9, 2022

rotateright commented Dec 11, 2022

rotateright commented Dec 28, 2022

InstCombine: incorrect fabs formation #59279

InstCombine: incorrect fabs formation #59279

Comments

nunoplopes commented Dec 1, 2022

nunoplopes commented Dec 1, 2022

rotateright commented Dec 1, 2022

jcranmer-intel commented Dec 1, 2022

nunoplopes commented Dec 1, 2022

nunoplopes commented Dec 2, 2022

arsenm commented Dec 2, 2022

nunoplopes commented Dec 4, 2022

nunoplopes commented Dec 4, 2022

jyknight commented Dec 5, 2022 • edited

nunoplopes commented Dec 5, 2022

jyknight commented Dec 5, 2022

RalfJung commented Dec 5, 2022 • edited

jcranmer-intel commented Dec 5, 2022

nunoplopes commented Dec 6, 2022

rotateright commented Dec 9, 2022

nunoplopes commented Dec 9, 2022

rotateright commented Dec 9, 2022

rotateright commented Dec 11, 2022

rotateright commented Dec 28, 2022

jyknight commented Dec 5, 2022 •

edited

RalfJung commented Dec 5, 2022 •

edited