26871 – sitofp conversion to half-precision float generates single-precision float without truncating

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Common Code Generator Code
Version:	trunk
Hardware:	Macintosh MacOS X

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

Reported:	2016-03-07 18:02 PST by Andres Noetzli
Modified:	2016-05-05 20:02 PDT
CC List:	4 users

Description Andres Noetzli 2016-03-07 18:02:44 PST

The following program converts -4095 (which is not exactly representable as a half-precision float) to a half-precision float, once at runtime and once at compile time, adds 1.0 to each result, and compares the two:

$ cat ../example.ll
define half @foo(i32 %x) {
  %r = sitofp i32 %x to half
  %rr = fadd half %r, 1.0
  ret half %rr
}

define i1 @main() {
  %x = sitofp i32 -4095 to half
  %xx = fadd half %x, 1.0
  %y = call half @foo(i32 -4095)
  %rr = fcmp oeq half %xx, %y
  ret i1 %rr
}
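As a rough illustration of why -4095 is not exactly representable, the Python sketch below rounds it to the nearest half-precision value using the standard `struct` module's binary16 format (`"e"`). The helper names `to_half` and `half_bits` are made up for this example and are not part of LLVM. Half precision has an 11-bit significand, so values between 2048 and 4096 are spaced 2 apart and 4095 rounds (ties to even) to 4096.

```python
import struct


def to_half(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def half_bits(value: float) -> int:
    """Return the 16-bit encoding of the binary16 value nearest to `value`."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


# Models `sitofp i32 -4095 to half` with correct rounding.
x = to_half(-4095.0)
bits = half_bits(-4095.0)

print(x, hex(bits))  # prints: -4096.0 0xec00
```

The encoding 0xEC00 is the same constant (60416) that the compiler materializes for the constant-folded operand in `_main`. Adding 1.0 to -4096 gives -4095 again, which rounds back to -4096, so a correctly rounded `%xx` is -4096.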

One would expect the result to be true (1), but the program returns false (0):

$ bin/llc -filetype=obj ../example.ll && clang ../example.o -o a.out && ./a.out; echo $?
0

The reason seems to be that cvtsi2ssl converts the integer to a single-precision float, and the addition then uses that value directly instead of first truncating it to half precision and extending it back to single precision. The problem seems to appear on multiple architectures (x86, ARM, ...). The x86 output:

$ bin/llc -o - ../example.ll
	.section	__TEXT,__text,regular,pure_instructions
	.macosx_version_min 10, 11
	.globl	_foo
	.p2align	4, 0x90
_foo:                                   ## @foo
	.cfi_startproc
## BB#0:
	pushq	%rax
Ltmp0:
	.cfi_def_cfa_offset 16
	cvtsi2ssl	%edi, %xmm0
	movss	%xmm0, 4(%rsp)          ## 4-byte Spill
	movl	$15360, %edi            ## imm = 0x3C00
	callq	___extendhfsf2
	addss	4(%rsp), %xmm0          ## 4-byte Folded Reload
	popq	%rax
	retq
	.cfi_endproc

	.globl	_main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## BB#0:
	pushq	%rax
Ltmp1:
	.cfi_def_cfa_offset 16
	movl	$-4095, %edi            ## imm = 0xFFFFFFFFFFFFF001
	callq	_foo
	callq	___truncsfhf2
	movzwl	%ax, %edi
	callq	___extendhfsf2
	movss	%xmm0, 4(%rsp)          ## 4-byte Spill
	movl	$60416, %edi            ## imm = 0xEC00
	callq	___extendhfsf2
	movss	4(%rsp), %xmm1          ## 4-byte Reload
                                        ## xmm1 = mem[0],zero,zero,zero
	cmpeqss	%xmm0, %xmm1
	movd	%xmm1, %eax
	andl	$1, %eax
	popq	%rcx
	retq
	.cfi_endproc


.subsections_via_symbols
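
The two code paths in the listing can be modeled numerically. The following is only a sketch of the arithmetic, not of the code generator: `to_half` and `to_single` are hypothetical helpers that round through binary16 and binary32 with Python's `struct` module, `constant_folded` stands for the value the compiler computed for `%xx`, and `runtime_without_rounding` stands for `_foo` followed by the `___truncsfhf2` call in `_main`.

```python
import struct


def to_half(value: float) -> float:
    """Round to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value: float) -> float:
    """Round to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def constant_folded(n: int) -> float:
    """Half-precision semantics: round after the conversion and after the add."""
    x = to_half(float(n))
    return to_half(x + 1.0)


def runtime_without_rounding(n: int) -> float:
    """Model of the emitted code: the conversion result stays in single precision."""
    r = to_single(float(n))    # cvtsi2ssl: -4095 is exact in single precision
    rr = to_single(r + 1.0)    # addss on the unrounded value
    return to_half(rr)         # truncation to half happens only in the caller


print(constant_folded(-4095), runtime_without_rounding(-4095))
# prints: -4096.0 -4094.0
```

Under these assumptions the compile-time side yields -4096 while the runtime side yields -4094, which is consistent with the `fcmp oeq` returning false.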

Comment 1 Ahmed Bougacha 2016-05-05 20:02:54 PDT

Should be fixed by:
r268700 [CodeGen] Round [SU]INT_TO_FP result when promoting from f16.
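
Going only by the commit title, the fix rounds the integer-to-float result to half precision when the f16 operation is promoted to a wider type. The Python sketch below models that idea numerically and checks it over a small, arbitrarily chosen range of integers. It is not the LLVM implementation, and all function names are invented for illustration.

```python
import struct


def to_half(value: float) -> float:
    """Round to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def to_single(value: float) -> float:
    """Round to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def reference(n: int) -> float:
    """Half-precision semantics for `sitofp` followed by `fadd 1.0`."""
    return to_half(to_half(float(n)) + 1.0)


def promoted_without_rounding(n: int) -> float:
    """Promoted to single precision, conversion result left unrounded."""
    return to_half(to_single(to_single(float(n)) + 1.0))


def promoted_with_rounding(n: int) -> float:
    """Promoted to single precision, conversion result rounded to half first."""
    r = to_half(to_single(float(n)))
    return to_half(to_single(r + 1.0))


# Assumed test range; it stays well inside half precision's finite range.
inputs = range(-5000, 5001)
buggy_mismatches = [n for n in inputs if promoted_without_rounding(n) != reference(n)]
fixed_mismatches = [n for n in inputs if promoted_with_rounding(n) != reference(n)]

print(-4095 in buggy_mismatches, fixed_mismatches)
```

In this model the extra rounding step is what makes the promoted computation agree with the half-precision reference, including for the -4095 input from the report.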