Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[X86] No support for fp stack in inline asm #1251

Closed
llvmbot opened this issue Aug 12, 2006 · 41 comments
Closed

[X86] No support for fp stack in inline asm #1251

llvmbot opened this issue Aug 12, 2006 · 41 comments
Labels
bugzilla Issues migrated from bugzilla compile-fail Use [accepts-invalid] and [rejects-valid] instead llvm:regalloc

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 12, 2006

Bugzilla Link 879
Resolution FIXED
Resolved on Jun 29, 2011 01:56
Version 1.8
OS Linux
Blocks #1234 llvm/llvm-bugzilla-archive#3696 llvm/llvm-bugzilla-archive#4434
Reporter LLVM Bugzilla Contributor
CC @asl,@DimitryAndric,@EdSchouten,@nlewycky,@pwo,@stoklund

Extended Description

Hi there,

I was compiling lame-3.95.1 with the new tar file llvm-1.8a.tar.gz and
llvm-gcc4-1.8-source on Fedora Core 5. I wanted to check some timings for .wav
to .mp3 encoding.

The lame package compiles fine using llvm-gcc and -O0 but it dies using -O3
(which is the default opt level in the makefile). Gcc 4.0.3 works fine with all
opt levels. Llvm-gcc dies on all opt levels, except -0O.

The compilation goes through the mpglib subdirectory fine but it dies in the
libmp3lame subdirectory with the following:

snip....
llvm-gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../include -I. -I../mpglib/ -I.. -Wall
-pipe -O3 -MT VbrTag.lo -MD -MP -MF .deps/VbrTag.Tpo -c VbrTag.c -fPIC -DPIC -o
.libs/VbrTag.o
VbrTag.c: In function 'PutLameVBR':
VbrTag.c:772: warning: pointer targets in passing argument 1 of
'__builtin_strncpy' differ in signedness
VbrTag.c: In function 'PutVbrTag':
VbrTag.c:1000: warning: pointer targets in passing argument 2 of
'CRC_writeheader' differ in signedness
cc1: SelectionDAGISel.cpp:2142: void
llvm::SelectionDAGLowering::visitInlineAsm(llvm::CallInst&): Assertion
`!Regs.Regs.empty() && "Couldn't allocate output reg!"' failed.
VbrTag.c: At top level:
VbrTag.c:1030: internal compiler error: Aborted
snip...

I don't know enough about llvm at present, to offer an intelligent guess as to
why this is happening. Sorry ;)

Lame source can be downloaded here:
http://prdownloads.sourceforge.net/lame/lame-3.95.1.tar.gz

Thanks,
Kelly

@lattner
Copy link
Collaborator

lattner commented Aug 12, 2006

Can you please attach a ".i" file, generated according to these instructions:
http://llvm.org/docs/HowToSubmitABug.html#front-end

Thanks,

-Chris

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 13, 2006

Crashes llvm-gcc
A .i file of the offending VbrTag.c file.

@lattner
Copy link
Collaborator

lattner commented Aug 17, 2006

Here is a reduced testcase:

double floor(double __x) {
register long double __value;
__volatile unsigned short int __cw;
__volatile unsigned short int __cwtmp;
__asm           __volatile("fnstcw %0":"=m"(__cw));
__cwtmp = (__cw & 0xf3ff) | 0x0400;
__asm           __volatile("fldcw %0"::"m"(__cwtmp));
__asm           __volatile("frndint":"=t"(__value):"0"(__x));
__asm           __volatile("fldcw %0"::"m"(__cw));
return __value;
}

Basically this boils down to the fact that we don't support "t" constraints (fp stack) in inline asms. 

Unfortunately, I don't plan to implement this support in the near future, however, as a work-around,
most linux distros allow you to compile with -D__NO_MATH_INLINES, which disables these.

-Chris

@lattner
Copy link
Collaborator

lattner commented Aug 25, 2006

Here's a testcase from redhat FC5 headers:

double %intcoord(double %X) {
%tmp85 = call double asm sideeffect "frndint", "={st},0,{dirflag},{fpsr},~{flags}"( double %X)
ret double %tmp85
}

It would be good to support the simple case of this at least.

-Chris

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 25, 2006

This patch:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060821/037043.html
was applied to work around this problem on x86/Linux. It needs to be undone once
this bug is resolved.

@lattner
Copy link
Collaborator

lattner commented Aug 29, 2006

*** Bug #1243 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Sep 22, 2006

*** Bug llvm/llvm-bugzilla-archive#921 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Oct 6, 2006

*** Bug llvm/llvm-bugzilla-archive#937 has been marked as a duplicate of this bug. ***

@asl
Copy link
Collaborator

asl commented Oct 29, 2006

Another simple testcase from Qt:

inline double qCos_x86(double a)
{
double r;
asm (
"fcos"
: "=t" (r) : "0" (a));
return r;
}

@lattner
Copy link
Collaborator

lattner commented Nov 1, 2006

Here are some testcases:

double floor(double __x) {
double __value;
__asm ("frndint":"=t"(__value):"0"(__x));
return __value;
}

// This has two inputs and pops one.
int test(double x, double y) {

register char __result;
asm ("fucomip %%st(1), %%st; seta %%al"
: "=a" (__result) : "u" (y), "t" (x) : "cc", "st");
return __result;
}
double test2(double __x) {
double __value;
__asm volatile("fscale" : "=t" (__value) : "0" (1.0), "u" (__x));
return __value;
}

Unfortunately, this won't happen for 1.9.

-Chris

@asl
Copy link
Collaborator

asl commented Dec 16, 2006

*** Bug llvm/llvm-bugzilla-archive#1051 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Feb 20, 2007

*** Bug llvm/llvm-bugzilla-archive#1211 has been marked as a duplicate of this bug. ***

@asl
Copy link
Collaborator

asl commented Mar 26, 2007

*** Bug llvm/llvm-bugzilla-archive#1272 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Mar 26, 2007

akor, could you try adding something like this to i386.h:

builtin_assert("__NO_MATH_INLINES");

Somewhere in TARGET_CPU_CPP_BUILTINS. This should eliminate the worst part of this problem until we
can fix it right.

@lattner
Copy link
Collaborator

lattner commented Mar 26, 2007

s/builtin_assert/builtin_define, sorry.

@asl
Copy link
Collaborator

asl commented Mar 26, 2007

Ok. Will try.

@asl
Copy link
Collaborator

asl commented Mar 27, 2007

I'm ok with the following patch:

diff -r 4fe28e061ca4 gcc/config/i386/i386.h
--- a/gcc/config/i386/i386.h Sun Mar 25 09:02:33 2007 +0000
+++ b/gcc/config/i386/i386.h Tue Mar 27 21:01:00 2007 +0400
@@ -746,8 +746,11 @@ extern int x86_prefetch_sse;
{
builtin_define ("__nocona");
builtin_define ("nocona"); \

  •   }                                                       \
    
  • } \
  •   }                                                 \
    
  •  /*APPLE LOCAL begin LLVM llvm/llvm-bugzilla-archive#879  workaround */  \
    
  •  builtin_define("__NO_MATH_INLINES");          \
    
  •  /*APPLE LOCAL end LLVM llvm/llvm-bugzilla-archive#879  workaround */    \
    
  • }
    while (0)

#define TARGET_CPU_DEFAULT_i386 0

Btw, is it ok, that apple-related parts uses tabs for indentation?

@lattner
Copy link
Collaborator

lattner commented Mar 30, 2007

Thanks Anton, applied:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070326/046493.html

I think this should significantly reduce the pain of this bug for linux users.

@lattner
Copy link
Collaborator

lattner commented Mar 10, 2008

Simple asms like that in Comment #​9 now work:

_qCos_x86:
fldl 4(%esp)
# InlineAsm Start
fcos
# InlineAsm End
#FP_REG_KILL
ret

@asl
Copy link
Collaborator

asl commented Mar 11, 2008

Simple asms like that in Comment #​9 now work:
Nice!

@lattner
Copy link
Collaborator

lattner commented Mar 11, 2008

There are a bunch more examples here:
http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Extended-Asm

@lattner
Copy link
Collaborator

lattner commented Mar 11, 2008

'f' now works. 'u' doesn't. 't' does, but probably not in combination with 'f' yet. Slow progress.

@asl
Copy link
Collaborator

asl commented Oct 16, 2008

*** Bug llvm/llvm-bugzilla-archive#2901 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Feb 14, 2009

*** Bug llvm/llvm-bugzilla-archive#3582 has been marked as a duplicate of this bug. ***

@asl
Copy link
Collaborator

asl commented Mar 1, 2009

*** Bug llvm/llvm-bugzilla-archive#3690 has been marked as a duplicate of this bug. ***

@EdSchouten
Copy link
Contributor

We at FreeBSD see the following crash when building our C library on i386:

Assertion failed: (StackTop == 0 && "Stack should be empty after a call!"), function handleSpecialFP, file .../llvm/lib/Target/X86/X86FloatingPoint.cpp, line 952.

Reduced testcase:

double
ldexp(double value, int exp)
{
double temp, texp, temp2;
texp = exp;

__asm ("fscale " : "=u" (temp2), "=t" (temp) : "0" (texp), "1" (value));
return (temp);

}

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jul 18, 2009

is there some progress on this? it seems to block quite a few projects

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 10, 2009

No progress. :-( All the Apply guys are crazy busy with other projects. I don't know if there are folks in the community who are working on this.

Is there a way to work around this at the source level?

@asl
Copy link
Collaborator

asl commented Aug 10, 2009

No progress. :-( All the Apply guys are crazy busy with other projects. I
don't know if there are folks in the community who are working on this.

Is there a way to work around this at the source level?
Usually - no, unfortunately...

@llvmbot
Copy link
Collaborator Author

llvmbot commented Oct 11, 2009

aKor says it might just work to turn 'u' to %st(1) in the frontend (clang).

info gcc says this:

`u'
      Second from top of 80387 floating-point stack (`%st(1)').

@asl
Copy link
Collaborator

asl commented Nov 16, 2009

*** Bug llvm/llvm-bugzilla-archive#5505 has been marked as a duplicate of this bug. ***

@lattner
Copy link
Collaborator

lattner commented Feb 8, 2010

*** Bug llvm/llvm-bugzilla-archive#6252 has been marked as a duplicate of this bug. ***

@llvmbot
Copy link
Collaborator Author

llvmbot commented Feb 25, 2010

aKor says it might just work to turn 'u' to %st(1) in the frontend (clang).

info gcc says this:

`u'
      Second from top of 80387 floating-point stack (`%st(1)').

Just a comment on the FreeBSD build dependency. I can find two uses of "u" in the FreeBSD source tree:

src/lib/libc/amd64/gen/ldexp.c:55: __asm ("fscale "
src/lib/libc/amd64/gen/ldexp.c-56- : "=u" (temp2), "=t" (temp)

--
src/lib/libc/i386/gen/ldexp.c:57: __asm ("fscale "
src/lib/libc/i386/gen/ldexp.c-58- : "=u" (temp2), "=t" (temp)

@llvmbot
Copy link
Collaborator Author

llvmbot commented May 25, 2010

is there some progress on this? it seems to block quite a few projects

It's seems I can compile these code using clang 1.1 and llvm 2.7.

the result assembly code is

.file	"a.c"
.text
.globl	ldexp
.align	16, 0x90
.type	ldexp,@function

ldexp:
pushl %ebp
movl %esp, %ebp
subl $80, %esp
movsd 8(%ebp), %xmm0
movl 16(%ebp), %eax
movsd %xmm0, -16(%ebp)
movl %eax, -20(%ebp)
movl -20(%ebp), %eax
cvtsi2sd %eax, %xmm0
movsd %xmm0, -40(%ebp)
movsd -40(%ebp), %xmm0
movsd -16(%ebp), %xmm1
movsd %xmm1, -64(%ebp)
fldl -64(%ebp)
movsd %xmm0, -56(%ebp)
fldl -56(%ebp)
#APP
fscale
#NO_APP
fxch %st(1)
fstpl -72(%ebp)
movsd -72(%ebp), %xmm0
fstpl -80(%ebp)
movsd -80(%ebp), %xmm1
movsd %xmm0, -48(%ebp)
movsd %xmm1, -32(%ebp)
movsd %xmm1, -8(%ebp)
fldl -8(%ebp)
addl $80, %esp
popl %ebp
ret
.size ldexp, .-ldexp

.section	.note.GNU-stack,"",@progbits

Is that means this bug is fixed?

@pwo
Copy link

pwo commented Jul 11, 2010

is there some progress on this? it seems to block quite a few projects

It's seems I can compile these code using clang 1.1 and llvm 2.7.

The ldexp testcase works at >=-O1 but sill fails at -O0, clang r108078.

@lattner
Copy link
Collaborator

lattner commented Oct 2, 2010

Reverified. Jakob's work on the stackifier fixed at least one of these, but we still get some assertions, such as:

Assertion failed: (RegMap[RegOnTop] < StackTop), function moveToTop, file X86FloatingPoint.cpp, line 200.
7 clang 0x0000000100e2351c (anonymous namespace)::FPS::moveToTop(unsigned int, llvm::ilist_iteratorllvm::MachineInstr) + 332
8 clang 0x0000000100e216b9 (anonymous namespace)::FPS::handleSpecialFP(llvm::ilist_iteratorllvm::MachineInstr&) + 953
9 clang 0x0000000100e1f1f8 (anonymous namespace)::FPS::processBasicBlock(llvm::MachineFunction&, llvm::MachineBasicBlock&) + 1032
10 clang 0x0000000100e1e2ef (anonymous namespace)::FPS::runOnMachineFunction(llvm::MachineFunction&) + 463

@stoklund
Copy link
Mannequin

stoklund mannequin commented Oct 4, 2010

To summarize the current state:

  • "t" and "u" are both supported, but they must appear in that order. Same for "st(0)", "st(1)".
  • Dead outputs are not properly popped.
  • We don't support pushing or popping more than two stack entries.

@DimitryAndric
Copy link
Collaborator

I received a small piece of sample code (basically a rint() function)
where clang also doesn't seem to pop the FP stack properly, and I think
it belongs under this PR:

int irint(double x)
{
int n;
asm("fistl %0" : "=m" (n) : "t" (x));
return (n);
}

gcc produces:

irint:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
fldl 8(%ebp)
#APP
fistl -4(%ebp)
#NO_APP
fstp %st(0)
movl -4(%ebp), %eax
leave
ret

while clang produces:

irint:
pushl %ebp
movl %esp, %ebp
subl $4, %esp
fldl 8(%ebp)
#APP
fistl -4(%ebp)
#NO_APP
movl -4(%ebp), %eax
addl $4, %esp
popl %ebp
ret

E.g, it loads the argument on the fp stack (fldl 8(%ebp)), but does not
pop it.

@stoklund
Copy link
Mannequin

stoklund mannequin commented Jun 28, 2011

Fixed in r134018

Chris' tests now produce:

_floor: ## @​floor

BB#0: ## %entry

pushl	%ebp
movl	%esp, %ebp
fldl	8(%ebp)
## InlineAsm Start
frndint
## InlineAsm End
popl	%ebp
ret

.globl	_test
.align	4, 0x90

_test: ## @​test

BB#0: ## %entry

pushl	%ebp
movl	%esp, %ebp
fldl	16(%ebp)
fldl	8(%ebp)
## InlineAsm Start
fucomip %st(1), %st; seta %al
## InlineAsm End
fstp	%st(0)
movsbl	%al, %eax
popl	%ebp
ret

.globl	_test2
.align	4, 0x90

_test2: ## @​test2

BB#0: ## %entry

pushl	%ebp
movl	%esp, %ebp
fld1
fldl	8(%ebp)
fxch
## InlineAsm Start
fscale
## InlineAsm End
fstp	%st(1)
popl	%ebp
ret

Roman's ldexp:

_ldexp: ## @​ldexp

BB#0: ## %entry

pushl	%ebp
movl	%esp, %ebp
pushl	%eax
movl	16(%ebp), %eax
movl	%eax, -4(%ebp)
fildl	-4(%ebp)
fldl	8(%ebp)
## InlineAsm Start
fscale 
## InlineAsm End
fstp	%st(1)
addl	$4, %esp
popl	%ebp
ret

Dimitry:

_irint: ## @​irint

BB#0: ## %entry

pushl	%ebp
movl	%esp, %ebp
pushl	%eax
fldl	8(%ebp)
## InlineAsm Start
fistl -4(%ebp)
## InlineAsm End
fstp	%st(0)
movl	-4(%ebp), %eax
addl	$4, %esp
popl	%ebp
ret

@lattner
Copy link
Collaborator

lattner commented Jun 29, 2011

Fantastic Jakob!

@lattner
Copy link
Collaborator

lattner commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#937

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla compile-fail Use [accepts-invalid] and [rejects-valid] instead llvm:regalloc
Projects
None yet
Development

No branches or pull requests

6 participants