New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APFloat::fusedMultiplyAdd is broken, leading to incorrect constant folding for libc fmal calls. #21102
Comments
assigned to @lhames |
Reproduced locally. I'll investigate shortly. |
Totally the wrong bug. Disregard my last comment. :) |
APFloat::fusedMultiplyAdd failure case fmal(1.0L, 1.0L, 3.0L) To see the failure run: opt -constprop -S -o - testcase.ll Note The constant to be returned should be 4.0L (0xK40018000000000000000), but instead it's 0.0L (0xK00000000000000000000). |
I've tracked this down to APFloat::multiplySignificand. It's due to the way APFloat computes the number of "parts" (uint64 buckets) required to hold the significand. Most methods seem to reference APFloat::partCount() to determine part count: unsigned int static inline unsigned int However, in multiplySignificand we have: newPartsCount = partCountForBits(precision * 2); For an fp80, which has 64 bits of significand, the partCount will return 3 (i.e. 3 x uint64s required for the significand), whereas multiplySignificand will compute newPartsCount = 2. Both answers are small enough that for the duration of multiplySignificand, the significand is held in a 4 x uint64 scratch space, which is more than enough to never overflow. In the block that adds the addend to the intermediate result (look for "if (addend && addend->isNonZero())" ...) we call lost_fraction = addOrSubtractSignificand(extendedAddend, false); Then addOrSubtractSignificand calls: return APInt::tcAdd(parts, rhs.significandParts(), 0, partCount()); But now we're telling APInt to do an addition with a partCount of 3, rather than two. For the above constants, fmal(1.0L, 1.0L, 3.0L), The result of the immediate multiplication is 0x0.8p1, and we have 3.0L = 0x1.8p1, so the resulting significands are (both 128 bits wide) 010...0
This addition overflows into bit 128, but APInt::tcAdd doesn't treat this as an overflow, because we told it that these ints were 3 x 64 bits wide. On return to multiplySignificand lost_fraction is therefore 0, but all subsequent calculations in multiplySignificand work the the internally calculated numPartsCount, so the overflowed bits are silently dropped. So - that's what's going wrong. The next question is how it should be fixed. It seems very strange to me that so many internal calculations in APFloat are using an extra part. I tried dropping the " + ' in partCount(), i.e: unsigned int This caused a huge number of regression failures that seem to be predominantly caused by use of default initialised (empty) APFloats(), which evidently still want at least one "part" allocated. Switching to: unsigned int fixed most of these bugs, but there are still a few failures left. If this is the right way forward (i.e. if APFloat should only be allocating enough bits to hold the significand and no more) then I just need to track down the remaining bugs. If, on the other hand, APFloat is supposed to allocate an extra part (for overflow?), then we need to teach multiplySignificand to be aware of that when doing the addition. Any APFloat experts out there want to weigh in on which approach is the right one? |
Fix for APFloat::multiplySignificand The approach I took was to reserve an extra bit for the significand in multiplySignificand into which the add operation can overflow. (i.e. the intermediate significand is held in 2 * precision + 1 bits, where before it had only been 2 * precision bits). This patch will be submitted to llvm-commits. Commentary should be made there, at least for now, in order to keep the discussion open to people who aren't specifically following this bug. |
Fixed in r222374. |
*** Bug llvm/llvm-bugzilla-archive#21623 has been marked as a duplicate of this bug. *** |
mentioned in issue llvm/llvm-bugzilla-archive#21623 |
Extended Description
APFloat::multiplySignificand generates incorrect results when called with Addend != 0 (i.e. from APFloat::fusedMultiplyAdd). This results in incorrect results from libc fmal calls with constant arguments. I have tested this X86, but I believe it will hit any architecture that uses APFloat to handle long double constants.
The output is only incorrect for some constants.
E.g. fmal(3.0L, 4.0L, 5.0L) == 1.0 instead of 17.0L.
However fmal(-3.0L, 4.0L, 5.0L) == -7.0L as expected.
Compiling the following program with 'clang -O3 -o foo foo.c' on X86 demonstrates the bug:
#include <math.h>
#include <stdio.h>
int main(void) {
printf("%Lf\n", fmal(3.0L, 4.0L, 5.0L));
return 0;
}
The misbehavior for the specific constants above was introduced by r181715. The commit message for that revision was: "Fix a bug that APFloat::fusedMultiplyAdd() mistakenly evaluate "14.5f * -14.5f + 225.0f" to 225.0f.". I have confirmed that rolling back to r181714 causes "fmal(3.0L, 4.0L, 5.0L)" to generate correct results, but breaks "14.5f * -14.5f + 225.0f".
The text was updated successfully, but these errors were encountered: