LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 711 - [linscan] Coallescing physregs with virtregs should allow spilling the virtreg portion
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Common Code Generator Code
Version: 1.0
Hardware: All All
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords: compile-fail
Depends on:
Blocks: 699
Reported: 2006-02-23 01:50 PST by Chris Lattner
Modified: 2010-02-22 12:55 PST

See Also:
Fixed By Commit(s):


Attachments
Failed bytecode (244.93 KB, application/octet-stream)
2007-03-20 11:41 PDT, Anton Korobeynikov

Description Chris Lattner 2006-02-23 01:50:34 PST
This problem is causing fastcc on x86 to fail in some corner cases.  Here's a testcase:

---
target endian = little
target pointersize = 32
target triple = "i686-pc-linux-gnu"
%.str_28 = external global [24 x sbyte]		; <[24 x sbyte]*> [#uses=1]

implementation   ; Functions:

fastcc void %apply_stencil_op_to_pixels(int* %tmp.414, uint %n, int* %x, int* %y, uint %oper, ubyte* %mask) {
entry:
	switch uint %oper, label %label.5 [
		 uint 5386, label %label.6
	]

label.5:		; preds = %entry
	br label %no_exit.8

no_exit.8:		; preds = %then.17, %label.5
	%tmp.409 = getelementptr ubyte* %mask, uint 0		; <ubyte*> [#uses=1]
	%tmp.410 = load ubyte* %tmp.409		; <ubyte> [#uses=0]
	br label %then.17

then.17:		; preds = %no_exit.8
	%tmp.415 = load int* %tmp.414
	%tmp.417 = load ubyte** null		; <ubyte*> [#uses=1]
	%tmp.425 = getelementptr int* %y, uint 0		; <int*> [#uses=1]
	%tmp.426 = load int* %tmp.425		; <int> [#uses=1]
	%tmp.437 = getelementptr ubyte* %tmp.417, int %tmp.426		; <ubyte*> [#uses=2]
	%tmp.440 = load ubyte* %tmp.437		; <ubyte> [#uses=1]
	store ubyte %tmp.440, ubyte* %tmp.437
	%tmp.405101 = setlt uint 0, %n		; <bool> [#uses=1]
	br bool %tmp.405101, label %no_exit.8, label %UnifiedReturnBlock

label.6:		; preds = %entry
	%tmp.4.i = tail call uint %fwrite( sbyte* getelementptr ([24 x sbyte]* %.str_28, int 0, int 0), uint 23, uint 1, sbyte* null )		; <uint> [#uses=0]
	ret void

UnifiedReturnBlock:		; preds = %then.17, %no_exit.8
	ret void
}

declare int %fprintf(%struct._IO_FILE*, sbyte*, ...)

declare uint %fwrite(sbyte*, uint, uint, sbyte*)
---

Compiled with 'llc -enable-x86-fastcc', this crashes llc.

This is due to the coalescer coalescing virtregs with both EAX and EDX, which makes them unavailable 
to satisfy spills, causing the RA to run out of registers.  We want to coalesce physregs when possible, 
but we cannot pin them in the spiller: we have to be able to uncoalesce them.

This is almost certainly related to Bug 699.

-Chris
Comment 1 Anton Korobeynikov 2007-01-29 11:52:35 PST
Now that the inreg patch has landed, this bug seems more important. For
example, it currently breaks Qt:

Consider a regparm(3) function compiled in PIC mode on x86/Linux. In general,
eax, ebx, ecx and edx are all in use at function entry. The register allocator
apparently can't handle this corner case: llc -debug shows that it simply runs
out of registers and then goes into an infinite loop. Compiling the same
function in non-PIC mode (which frees ebx) lets the register allocator handle
this case correctly.

A cheap workaround is to lower regparm(3) to regparm(2). However, many
libraries seem to use stdcall + regparm(3) as a "fast" CC, so it would definitely
be better to support this. At the very least, looping forever is not good :)

I can provide an additional .ll file that causes llc to loop.
Comment 2 Chris Lattner 2007-01-29 12:14:23 PST
Evan, can you think of a reasonably simple way to work around this problem in the short-term?

-Chris
Comment 3 Evan Cheng 2007-01-30 19:13:04 PST
I have to spend some time (which I don't have right now) to figure out a fix. Do
you suppose it's easy for the coalescer to detect that a liverange has been
coalesced to more than one physical register?
Comment 4 Chris Lattner 2007-01-30 23:15:04 PST
I don't think that's the issue.  Consider a two register machine with code like this:

vreg1 = r1
vreg2 = r2

vreg3 = some operation

use vreg1
use vreg2

Right now, the coalescer will coalesce vreg1 with r1 because the live ranges don't overlap.  Then it 
coalesces vreg2 with r2, because those live ranges don't overlap.

Then the regalloc part starts, and tries to allocate vreg3.  However, there are no regs to allocate vreg3 
to, and badness ensues.

I think the right way to solve this is to detect badness and break long liveranges tied to physregs (like 
vreg1/2 above).  The problem with this is book-keeping: we'd need to remember where these things 
came from to know how/where to break them.

-Chris
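
The two-register scenario in Comment 4 can be illustrated with a toy model. This is only a sketch under stated assumptions, not LLVM's linear-scan code: the interval numbers, the `allocate` function, and the "spill the interval that ends last" heuristic are all hypothetical. It shows that an interval coalesced with (pinned to) a physreg is never a spill candidate, so allocating vreg3 fails, while the same intervals left uncoalesced can be handled by spilling vreg2.

```python
# Toy model only: names and the spill heuristic are illustrative, not LLVM APIs.
PHYS_REGS = ["r1", "r2"]


def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def allocate(intervals, pinned):
    """intervals: vreg -> (start, end). pinned: vreg -> physreg chosen by coalescing.

    Returns (assignment, spilled). Raises RuntimeError when every register is
    held by a pinned interval, because pinned intervals cannot be spilled here.
    """
    assignment = dict(pinned)
    spilled = []
    for vreg in sorted(intervals, key=lambda v: intervals[v][0]):
        if vreg in assignment:
            continue
        live = [o for o in assignment if overlaps(intervals[o], intervals[vreg])]
        busy = {assignment[o] for o in live}
        free = [r for r in PHYS_REGS if r not in busy]
        if free:
            assignment[vreg] = free[0]
            continue
        candidates = [o for o in live if o not in pinned]
        if not candidates:
            raise RuntimeError("ran out of registers allocating " + vreg)
        # Hypothetical heuristic: spill the overlapping interval that ends last.
        victim = max(candidates, key=lambda o: intervals[o][1])
        spilled.append(victim)
        assignment[vreg] = assignment.pop(victim)
    return assignment, spilled


# vreg1 and vreg2 are live across the definition of vreg3.
INTERVALS = {"vreg1": (0, 4), "vreg2": (1, 5), "vreg3": (2, 3)}

if __name__ == "__main__":
    print(allocate(INTERVALS, {}))
    try:
        allocate(INTERVALS, {"vreg1": "r1", "vreg2": "r2"})
    except RuntimeError as err:
        print("coalesced case:", err)
```

The model ignores reloads: a real allocator would still need a register for the spilled value at each of its uses.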
Comment 5 Chris Lattner 2007-01-30 23:16:32 PST
Oh yeah, in case it's not obvious, the desired code in this example is something like:

r1 = r1
r2 = r2
spill r2 -> ss#1

r2 = some operation

use r1
restore ss#1 -> r2
use r2

or something.
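
As a rough illustration of the rewrite in Comment 5, the sketch below inserts a spill just before the instruction that needs the register and a restore just before the next use of the old value. The `break_live_range` helper and the `(text, uses)` instruction encoding are hypothetical and purely textual; a real implementation would operate on live intervals and machine instructions.

```python
# Purely textual sketch; break_live_range and the tuple encoding are hypothetical.
def break_live_range(code, reg, slot, clobber_index):
    """code: list of (text, uses) pairs, where uses is the set of registers read.

    Spills `reg` to `slot` right before code[clobber_index], then restores it
    right before the next later instruction that reads `reg`.
    """
    out = []
    need_restore = False
    for i, (text, uses) in enumerate(code):
        if i == clobber_index:
            out.append("spill %s -> %s" % (reg, slot))
            out.append(text)
            need_restore = True
            continue
        if need_restore and reg in uses:
            out.append("restore %s -> %s" % (slot, reg))
            need_restore = False
        out.append(text)
    return out


# The coalesced program: vreg1/vreg2 already rewritten to r1/r2, and the
# "some operation" result placed in r2.
CODE = [
    ("r1 = r1", {"r1"}),
    ("r2 = r2", {"r2"}),
    ("r2 = some operation", set()),
    ("use r1", {"r1"}),
    ("use r2", {"r2"}),
]

if __name__ == "__main__":
    for line in break_live_range(CODE, "r2", "ss#1", 2):
        print(line)
```

Run on the example, this reproduces the instruction sequence listed above, with the spill placed before `r2 = some operation` and the restore before `use r2`.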
Comment 6 Anton Korobeynikov 2007-03-20 11:41:06 PDT
Created attachment 709
Failed bytecode

This is sample bytecode from Qt that causes llc to loop. Compile it with llc
-relocation-model=pic. The "bad" function is
_Z9_decOctetPPcP10QByteArrayP9ErrorInfo.
Comment 7 Chris Lattner 2007-05-17 13:59:04 PDT
Evan fixed this a while back.  Qt doesn't require any regparm hacks to build with 2.0.