
APFloat::fusedMultiplyAdd is broken, leading to incorrect constant folding for libc fmal calls. #21102

Closed
lhames opened this issue Aug 21, 2014 · 9 comments
Labels: bugzilla (Issues migrated from bugzilla)


lhames commented Aug 21, 2014

Bugzilla Link 20728
Resolution FIXED
Resolved on Nov 21, 2014 00:25
Version trunk
OS All
CC @adibiagio,@dexonsmith,@gregbedwell,@hfinkel,@gottesmm,@rotateright

Extended Description

APFloat::multiplySignificand generates incorrect results when called with Addend != 0 (i.e. from APFloat::fusedMultiplyAdd). This leads to incorrect results from libc fmal calls with constant arguments. I have tested this on X86, but I believe it will hit any architecture that uses APFloat to handle long double constants.

The output is only incorrect for some constants.
E.g. fmal(3.0L, 4.0L, 5.0L) == 1.0L instead of 17.0L.
However fmal(-3.0L, 4.0L, 5.0L) == -7.0L as expected.

Compiling the following program with 'clang -O3 -o foo foo.c' on X86 demonstrates the bug:

#include <math.h>
#include <stdio.h>

int main(void) {
  printf("%Lf\n", fmal(3.0L, 4.0L, 5.0L));
  return 0;
}

The misbehavior for the specific constants above was introduced by r181715. The commit message for that revision was: "Fix a bug that APFloat::fusedMultiplyAdd() mistakenly evaluate "14.5f * -14.5f + 225.0f" to 225.0f.". I have confirmed that rolling back to r181714 causes "fmal(3.0L, 4.0L, 5.0L)" to generate correct results, but breaks "14.5f * -14.5f + 225.0f".


lhames commented Aug 21, 2014

assigned to @lhames


lhames commented Sep 5, 2014

Reproduced locally. I'll investigate shortly.


lhames commented Sep 5, 2014

Totally the wrong bug. Disregard my last comment. :)


lhames commented Nov 15, 2014

APFloat::fusedMultiplyAdd failure case
Trivial test case derived from:

fmal(1.0L, 1.0L, 3.0L)

To see the failure run:

opt -constprop -S -o - testcase.ll

Note: the constant to be returned should be 4.0L (0xK40018000000000000000), but instead it's 0.0L (0xK00000000000000000000).


lhames commented Nov 17, 2014

I've tracked this down to APFloat::multiplySignificand. It's due to the way APFloat computes the number of "parts" (uint64 buckets) required to hold the significand. Most methods seem to reference APFloat::partCount() to determine part count:

unsigned int
APFloat::partCount() const
{
  return partCountForBits(semantics->precision + 1);
}

static inline unsigned int
partCountForBits(unsigned int bits)
{
  return ((bits) + integerPartWidth - 1) / integerPartWidth;
}

However, in multiplySignificand we have:

newPartsCount = partCountForBits(precision * 2);

For an fp80, which has 64 bits of significand, partCount() will return 3 during the FMA (the semantics are temporarily extended to hold the 128-bit intermediate product, so partCount() sees 128 + 1 bits), whereas multiplySignificand will compute newPartsCount = partCountForBits(128) = 2.

Both answers are small enough that for the duration of multiplySignificand, the significand is held in a 4 x uint64 scratch space, which is more than enough to never overflow.

In the block that adds the addend to the intermediate result (look for "if (addend && addend->isNonZero())" ...) we call

lost_fraction = addOrSubtractSignificand(extendedAddend, false);

Then addOrSubtractSignificand calls:

return APInt::tcAdd(parts, rhs.significandParts(), 0, partCount());

But now we're telling APInt to do an addition with a partCount of 3, rather than 2. For the above constants, fmal(1.0L, 1.0L, 3.0L), the result of the intermediate multiplication is 0x0.8p1, and we have 3.0L = 0x1.8p1, so the resulting significands are (both 128 bits wide):

    010...0
  + 110...0
This addition overflows into bit 128, but APInt::tcAdd doesn't treat this as an overflow, because we told it that these ints were 3 x 64 bits wide. On return to multiplySignificand, lost_fraction is therefore 0, but all subsequent calculations in multiplySignificand work with the internally calculated newPartsCount, so the overflowed bits are silently dropped.

So - that's what's going wrong. The next question is how it should be fixed. It seems very strange to me that so many internal calculations in APFloat are using an extra part. I tried dropping the "+ 1" in partCount(), i.e.:

unsigned int
APFloat::partCount() const
{
  return partCountForBits(semantics->precision);
}

This caused a huge number of regression failures, which seem to be predominantly caused by the use of default-initialized (empty) APFloats, which evidently still want at least one "part" allocated.

Switching to:

unsigned int
APFloat::partCount() const
{
  return partCountForBits(std::max(semantics->precision, 1U));
}

fixed most of these bugs, but there are still a few failures left.

If this is the right way forward (i.e. if APFloat should only be allocating enough bits to hold the significand and no more) then I just need to track down the remaining bugs. If, on the other hand, APFloat is supposed to allocate an extra part (for overflow?), then we need to teach multiplySignificand to be aware of that when doing the addition.

Any APFloat experts out there want to weigh in on which approach is the right one?


lhames commented Nov 19, 2014

Fix for APFloat::multiplySignificand
Attached is a patch that fixes the testcase I posted earlier. It passes all regression tests and the nightly test suite on x86.

The approach I took was to reserve an extra bit for the significand in multiplySignificand into which the add operation can overflow. (i.e. the intermediate significand is held in 2 * precision + 1 bits, where before it had only been 2 * precision bits).

This patch will be submitted to llvm-commits. Commentary should be made there, at least for now, in order to keep the discussion open to people who aren't specifically following this bug.


lhames commented Nov 19, 2014

Fixed in r222374.


llvmbot commented Nov 21, 2014

*** Bug llvm/llvm-bugzilla-archive#21623 has been marked as a duplicate of this bug. ***


llvmbot commented Nov 26, 2021

mentioned in issue llvm/llvm-bugzilla-archive#21623

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021
This issue was closed.