Double Signaling NaN converted to float becomes INF #43252

AMDG2 · 2019-11-05T13:14:26Z


Bugzilla Link	43907
Resolution	FIXED
Resolved on	Sep 30, 2020 06:28
Version	11.0
OS	Windows NT
Blocks	#14079 #46070
CC	@CaseyCarter,@thomcc,@zmodem,@RKSimon,@RalfJung,@rotateright

Extended Description

The following piece of code produce incorrect result with clang (and clang-cl) on Windows:

#include <cmath>
#include <limits>

int main() {
    const double quiet_nan_double = std::numeric_limits<double>::quiet_NaN();
    const float converted_to_float = float(quiet_nan_double);
    const bool isnan = std::isnan(converted_to_float);
    return isnan ? 1 : 0;
}

The program should return 1. It does on Linux with majors compilers. On Windows clang (and clang-cl) returns 0 no matter what compilation flags I tried. Here are some flags I used:

clang -O3 -Wall -Wextra -g nan_d2f.cpp
clang -O0 -Wall -Wextra -g nan_d2f.cpp
clang -Wall -Wextra -g nan_d2f.cpp
clang-cl /Ox /W4 /EHsc nan_d2f.cpp
clang-cl /W4 /EHsc nan_d2f.cpp

Here is version reported by clang -v:

clang version 9.0.0 (tags/RELEASE_900/final)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\scoop\apps\llvm\current\bin

It was downloaded from here: https://releases.llvm.org/9.0.0/LLVM-9.0.0-win64.exe#/dl.7z

Let me know if you need more info or cannot reproduce.

The text was updated successfully, but these errors were encountered:

AMDG2 · 2019-11-05T13:16:58Z

Sorry, the initial shared code is wrong. Here is the correct one:

#include <cmath>
#include <limits>

int main() {
    const double signaling_nan_double = std::numeric_limits<double>::signaling_NaN();
    const float converted_to_float = float(signaling_nan_double);
    const bool isnan = std::isnan(converted_to_float);
    return isnan ? 1 : 0;
}

rotateright · 2019-11-05T14:48:21Z

I can't reproduce this on macOS or Linux:
https://godbolt.org/z/2ciTCf

Can you output LLVM IR as shown in the above link and paste it here?

AMDG2 · 2019-11-05T14:56:21Z

It works as expected on macOS and Linux. Only issue is with Windows.

I am attaching the LLVM IR I just generated with the following command line:

clang -g -o nan_d2f_clang.llvm -mllvm --x86-asm-syntax=intel -S -fcolor-diagnostics -fno-crash-diagnostics -O2 -g0 -emit-llvm nan_d2f.cpp

AMDG2 · 2019-11-05T14:57:22Z

LLVM IR
Command line:

clang -g -o nan_d2f_clang.llvm -mllvm --x86-asm-syntax=intel -S -fcolor-diagnostics -fno-crash-diagnostics -O2 -g0 -emit-llvm nan_d2f.cpp

rotateright · 2019-11-05T15:04:44Z

It works as expected on macOS and Linux. Only issue is with Windows.

I don't know much about Windows, so someone else will have to investigate further. Here's the IR that I see in the attachment:

define dso_local i32 @main() local_unnamed_addr #0 {
%1 = alloca float, align 4
%2 = bitcast float* %1 to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %2) #3
store float 0x7FF0000000000000, float* %1, align 4, !tbaa !8
%3 = call i16 @_fdtest(float* nonnull %1) #3
call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %2) #3
%4 = icmp eq i16 %3, 2
%5 = zext i1 %4 to i32
ret i32 %5
}

declare dso_local i16 @_fdtest(float*) local_unnamed_addr #1

I've never seen "_fdtest" before...
https://devblogs.microsoft.com/cppblog/improving-the-performance-of-standard-library-functions/

AMDG2 · 2019-12-09T10:39:56Z

I reported the bug to Microsoft as it seems it is related to their STL. They were not able to reproduce the bug so I setup a repository in Azure DevOps to reproduce the bug. If that can help anyone to reproduce it I am sharing it here: https://dev.azure.com/feildel/clang-bug-nan-d2f

CaseyCarter · 2019-12-13T18:19:35Z

Minimal repro:

#include <assert.h>
#include <math.h>

int main() {
assert(isnan(float(__builtin_nans("1"))));
}

The assertion fires when compiled with clang-cl:

clang version 9.0.0 (tags/RELEASE_900/final)
Target: x86_64-pc-windows-msvc
Thread model: posix

but not for MSVC (or gcc/clang on Ubuntu, for that matter) per https://godbolt.org/z/65wWKG.

thomcc · 2020-09-14T14:57:35Z

In rust-lang/rust#69532 we've hit this bug, and I'm pretty confident I've figured out the issue.

Here's a minimal example that just operates on IR https://godbolt.org/resetlayout/Zf7HMh , in case there's not one already in this thread.

Here's a more illustrative test case that demonstrates what's happening: https://godbolt.org/z/c59E4E . It seems that the bottom 29 bits of the NaN payload are being discarded.

I believe the issue is here:

llvm-project/llvm/lib/Support/APFloat.cpp

Lines 2203 to 2205 in 9868ea7

    
           // If this is a truncation, perform the shift before we narrow the storage. 
        
           if (shift < 0 && (isFiniteNonZero() || category==fcNaN)) 
        
             lostFraction = shiftRight(significandParts(), oldPartCount, -shift);

(I also believe L2229-L2232 (shifting right) very likely suffers from a similar issue)

When converting from binary64 to binary32, the -shift value will evaluate to 29, which explains why the bottom 29 bits are gone.

I think LLVM has leeway with what to do precisely, but:

IMO APFloat should properly preserve NaN payloads where possible, at least at compile time.
Even if not, compiling NaN => Inf should be fixed.

So, for the first part, I don't think IEEE 754 comes out an explicitly says that you should truncate the most significant bits when going from a higher-precision format to a lower precision one (I could be wrong), but it seems to imply it, at least for qNaN (sNaN is obviously implementation defined but it's hard for me to see why it should be treated very much differently to qNaN here):

Conversion of a quiet NaN from a narrower format to a wider format in
the same radix, and then back to thesame narrower format, should not
change the quiet NaN payload in any way except to make it canonical.

6.2.3 NaN propagation (in 754-2008 anyway, I don't have 754-2019)

That is, binary32 => binary64 => binary32 should be unchanged, which it would be for truncating the most significant bits, but not for shifting, or any other obvious method of discarding a subset of the bits.

That said, this might just move your issue to the high bits unless you also ensure the output is a NaN if the input is one. I think this just needs to be an explicit check.

I provided some rust code that does these things for rustc's port of APFloat in rust-lang/rust#69532 (comment) (it could be very wrong but probably gets the idea across — someone who works more regularly/deeply with IEEE 754 should take a look and possibly correct my mistakes — that's very welcome!).

TL; DR:

APFloat does wrong things with NaN payloads here:

llvm-project/llvm/lib/Support/APFloat.cpp

Lines 2203 to 2205 in 9868ea7

    
           // If this is a truncation, perform the shift before we narrow the storage. 
        
           if (shift < 0 && (isFiniteNonZero() || category==fcNaN)) 
        
             lostFraction = shiftRight(significandParts(), oldPartCount, -shift);

(and probably elsewhere in this function, and maybe elsewhere in APFloat).

This can turn NaNs into infinities (and cause APFloats to have categories that do not match the categories of their bitpatterns).
Easy fix is just to ensure that NaN in => NaN out.
Slightly harder fix is to ensure (at least in convert) that it's the upper bits of the NaN's payload that are discarded (not lower).
Miri floating point NaN conversion issue rust-lang/rust#69532 (comment) has some rust code that performs an equivalent operation for Rust's APFloat port, but it's slapdash and may have bugs. Seems to work tho.

P.S. I was filing this as a new issue when I noticed this at the last minute, which is why it's phrased like it would be a new issue.

RalfJung · 2020-09-17T07:31:18Z

Is there some APFloat component that this can be assigned to? Would be good to get the APFloat maintainer(s) to look at this.

rotateright · 2020-09-17T15:39:04Z

I don't know if anyone is an APFloat maintainer/authority at this point, but I created patch proposal here:
https://reviews.llvm.org/D87835

rotateright · 2020-09-24T19:40:53Z

APFloat minimally fixed here:
https://reviews.llvm.org/rGe34bd1e0b03d

Can someone verify that the code on Windows behaves as expected now?

We will hopefully fix this better with:
https://reviews.llvm.org/D88238

AMDG2 · 2020-09-25T11:12:21Z

The reported issue is fixed on Windows using clang & clang-cl built from commit e34bd1e.

Thanks for your work!

AMDG2 · 2020-09-29T07:48:19Z

Will this fix make it to LLVM 11.0? Or shall we wait for LLVM 12?

Also, I don't know if I should switch the status of the bug to resolved, or if some people are responsible for this.

rotateright · 2020-09-29T12:28:52Z

Will this fix make it to LLVM 11.0? Or shall we wait for LLVM 12?

Also, I don't know if I should switch the status of the bug to resolved, or
if some people are responsible for this.

IIUC, this was not a recent regression, and the 11.0 release is close to done, but there's still a chance...
cc'ing Hans and marking as blocker.

zmodem · 2020-09-29T12:49:37Z

Will this fix make it to LLVM 11.0? Or shall we wait for LLVM 12?

Also, I don't know if I should switch the status of the bug to resolved, or
if some people are responsible for this.

IIUC, this was not a recent regression, and the 11.0 release is close to
done, but there's still a chance...
cc'ing Hans and marking as blocker.

If we have to spin another release candidate (rc5) we can pick up this fix, but otherwise it will have to wait for 11.0.1.

I've made a note, but will remove it from the blockers list for now.

zmodem · 2020-09-30T11:36:59Z

There will be an rc5, so I've pushed this to 11.x as 60a2520.

rotateright · 2020-09-30T13:28:41Z

There will be an rc5, so I've pushed this to 11.x as
60a2520.

Great - thank you!

zmodem · 2021-11-27T03:24:07Z

mentioned in issue #46070

zmodem mentioned this issue Jul 15, 2020

[meta] 11.0.0 Release Blockers #46070

Closed

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Double Signaling NaN converted to float becomes INF #43252

Double Signaling NaN converted to float becomes INF #43252

AMDG2 commented Nov 5, 2019

AMDG2 commented Nov 5, 2019

rotateright commented Nov 5, 2019

AMDG2 commented Nov 5, 2019

AMDG2 commented Nov 5, 2019

rotateright commented Nov 5, 2019

AMDG2 commented Dec 9, 2019

CaseyCarter commented Dec 13, 2019

thomcc mannequin commented Sep 14, 2020

RalfJung commented Sep 17, 2020

rotateright commented Sep 17, 2020

rotateright commented Sep 24, 2020

AMDG2 commented Sep 25, 2020

AMDG2 commented Sep 29, 2020

rotateright commented Sep 29, 2020

zmodem commented Sep 29, 2020

zmodem commented Sep 30, 2020

rotateright commented Sep 30, 2020

zmodem commented Nov 27, 2021

Double Signaling NaN converted to float becomes INF #43252

Double Signaling NaN converted to float becomes INF #43252

Comments

AMDG2 commented Nov 5, 2019

Extended Description

AMDG2 commented Nov 5, 2019

rotateright commented Nov 5, 2019

AMDG2 commented Nov 5, 2019

AMDG2 commented Nov 5, 2019

rotateright commented Nov 5, 2019

AMDG2 commented Dec 9, 2019

CaseyCarter commented Dec 13, 2019

thomcc mannequin commented Sep 14, 2020

RalfJung commented Sep 17, 2020

rotateright commented Sep 17, 2020

rotateright commented Sep 24, 2020

AMDG2 commented Sep 25, 2020

AMDG2 commented Sep 29, 2020

rotateright commented Sep 29, 2020

zmodem commented Sep 29, 2020

zmodem commented Sep 30, 2020

rotateright commented Sep 30, 2020

zmodem commented Nov 27, 2021