996 – MultiSource/Benchmarks/Prolangs-C/allroots broken on x86

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Backend: X86
Version:	trunk
Hardware:	PC Linux

Importance:	P normal
Assignee:	Evan Cheng

URL:
Keywords:	miscompilation

Depends on:
Blocks:

Reported:	2006-11-10 12:12 PST by Anton Korobeynikov
Modified:	2018-11-07 00:17 PST
CC List:	3 users

See Also:
Fixed By Commit(s):

Attachments
LLVM bytecode (3.78 KB, application/octet-stream) 2006-11-10 12:13 PST, Anton Korobeynikov
Generated assembler code (9.73 KB, application/octet-stream) 2006-11-10 12:13 PST, Anton Korobeynikov

Description Anton Korobeynikov 2006-11-10 12:12:37 PST

This test is definitely broken. Please find attached LLVM bytecode & generated
assembler code.

The generated code segfaults at the first "movaps" instruction in the newton
function. llc -march=pentium3 generates working code.

Comment 1 Anton Korobeynikov 2006-11-10 12:13:14 PST

Created attachment 456
LLVM bytecode

Comment 2 Anton Korobeynikov 2006-11-10 12:13:37 PST

Created attachment 457
Generated assembler code

Comment 3 Evan Cheng 2006-11-10 12:23:04 PST

Anton, could you give us more information on this?

If it crashes on the first movaps in newton:
movaps %xmm0, 48(%esp)

That must mean %esp is not properly aligned. Can you verify whether this is the case?

Can you tell me if it segfaults the first time it reaches this function? Do you
have a backtrace? The bug is likely to be higher up in the caller chain.

Comment 4 Chris Lattner 2006-11-10 12:28:07 PST

what is the stack alignment on linux?  8 bytes, 4 bytes?  I'm pretty sure it's not 16.

Comment 5 Chris Lattner 2006-11-10 12:29:40 PST

Might this be related to Bug 995?

-Chris

Comment 6 Anton Korobeynikov 2006-11-10 12:43:18 PST

Yes, it's that instruction. It crashes the first time the function is entered
(e.g. "break newton", "run", and a few "stepi"s lead to the crash). The stack
seems to be unaligned (at the point of the crash):

(gdb) p $esp
$1 = (void *) 0xbf8e0c08

and that seems to be the problem
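
The gdb value above can be checked arithmetically. A minimal sketch in Python, assuming the faulting operand is the 48(%esp) store quoted in Comment 3 and that movaps needs a 16-byte-aligned memory operand; the helper name is_aligned is made up for illustration:

```python
def is_aligned(address: int, alignment: int) -> bool:
    """Return True if address is a multiple of alignment (a power of two)."""
    return address & (alignment - 1) == 0


ESP = 0xBF8E0C08      # %esp as printed by gdb at the point of the crash
OPERAND = ESP + 48    # assumed effective address of 48(%esp) from Comment 3

print(hex(OPERAND))             # 0xbf8e0c38
print(OPERAND % 16)             # 8
print(is_aligned(ESP, 4))       # True: satisfies a 4-byte stack alignment
print(is_aligned(OPERAND, 16))  # False: not a valid movaps operand address
```

Under these assumptions the stack pointer meets a 4-byte alignment, but the store address is 8 bytes past the nearest 16-byte boundary, which is consistent with the fault.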

Comment 7 Anton Korobeynikov 2006-11-10 12:45:57 PST

Chris, default stack alignment on linux is 4 bytes, but sse stuff usually wants
memory operands to be 16 bytes aligned.

Comment 8 Chris Lattner 2006-11-10 12:47:20 PST

Evan, are we generating movaps for f64 load/stores or something?

-Chris

Comment 9 Evan Cheng 2006-11-10 13:19:31 PST

Yep. We use movaps for f32/f64 load / store. That obviously doesn't work for
non-darwin targets.

Comment 10 Anton Korobeynikov 2006-11-10 13:45:28 PST

Can we just override the stack alignment from the command line? This would allow
using movaps on non-darwin targets if the stack alignment is fine for it.

Comment 11 Chris Lattner 2006-11-10 13:46:38 PST

nope, stack alignment is part of the ABI.  You can't compile just some functions with aligned stacks: the 
stack coming into the function has to be aligned.

Comment 12 Anton Korobeynikov 2006-11-10 13:48:47 PST

Well, assume we have 4 bytes stack alignment on linux by default. Command line
switch will override stack alignment to e.g. 32. This will enable "movaps" but
doesn't break ABI.

Comment 13 Evan Cheng 2006-11-10 13:52:26 PST

I misspoke. Actually we are not using movaps for f32/f64 load / stores. The
movaps is probably added by the spiller. I'll poke.

Comment 14 Chris Lattner 2006-11-10 14:00:38 PST

this requires either:
1. dynamic readjustment of the stack pointer in the prolog
2. the entire program to be compiled with this option

Since 'entire program' implies libraries (e.g. libc), it isn't feasible.
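
Option 1 amounts to rounding the incoming stack pointer down to the wanted boundary in the prolog. A minimal arithmetic sketch in Python, using the %esp value from Comment 6 as a sample input; realign_down is a hypothetical helper, and this only models the address computation, not anything LLVM emitted at the time. A real prolog would also have to keep the original %esp (for example in a frame pointer) so the epilog can restore it.

```python
def realign_down(sp: int, alignment: int) -> int:
    """Round sp down to a multiple of alignment (a power of two).

    The stack grows toward lower addresses on x86, so rounding down
    only ever reserves extra space; it never reclaims the caller's.
    """
    return sp & ~(alignment - 1)


incoming_sp = 0xBF8E0C08                    # sample value from Comment 6
aligned_sp = realign_down(incoming_sp, 16)  # what the prolog would compute
padding = incoming_sp - aligned_sp          # bytes lost to realignment

print(hex(aligned_sp))  # 0xbf8e0c00
print(padding)          # 8
```

Because the padding depends on the incoming value, it is only known at run time, which is why the adjustment has to be dynamic rather than a fixed frame-size change.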

Comment 15 Evan Cheng 2006-11-10 15:44:30 PST

We should not have been generating spills of vector values in this test case.
This was a bug in the X86 max / min dag combine.
Fixed.

Comment 16 Anton Korobeynikov 2006-11-10 16:42:14 PST

Works for me now, thanks, Evan.