[linscan] Coallescing physregs with virtregs should allow spilling the virtreg portion #1083

Closed
lattner opened this issue Feb 23, 2006 · 7 comments
Labels: bugzilla (Issues migrated from bugzilla), compile-fail (Use [accepts-invalid] and [rejects-valid] instead), llvm:codegen

Comments

@lattner
Collaborator

lattner commented Feb 23, 2006

Bugzilla Link: 711
Resolution: FIXED
Resolved on: Feb 22, 2010 12:55
Version: 1.0
OS: All
Blocks: #1071
CC: @asl

Extended Description

This problem is causing fastcc on x86 to fail in some corner cases. Here's a testcase:


target endian = little
target pointersize = 32
target triple = "i686-pc-linux-gnu"
%.str_28 = external global [24 x sbyte] ; <[24 x sbyte]*> [#uses=1]

implementation ; Functions:

fastcc void %apply_stencil_op_to_pixels(int* %tmp.414, uint %n,
int* %x, int* %y, uint %oper, ubyte* %mask) {
entry:
switch uint %oper, label %label.5 [
uint 5386, label %label.6
]

label.5: ; preds = %entry
br label %no_exit.8

no_exit.8: ; preds = %then.17, %label.5
%tmp.409 = getelementptr ubyte* %mask, uint 0 ; <ubyte*> [#uses=1]
%tmp.410 = load ubyte* %tmp.409 ; [#uses=0]
br label %then.17

then.17: ; preds = %no_exit.8
%tmp.415 = load int* %tmp.414
%tmp.417 = load ubyte** null ; <ubyte*> [#uses=1]
%tmp.425 = getelementptr int* %y, uint 0 ; <int*> [#uses=1]
%tmp.426 = load int* %tmp.425 ; [#uses=1]
%tmp.437 = getelementptr ubyte* %tmp.417, int %tmp.426
; <ubyte*> [#uses=2]
%tmp.440 = load ubyte* %tmp.437 ; [#uses=1]
store ubyte %tmp.440, ubyte* %tmp.437
%tmp.405101 = setlt uint 0, %n ; [#uses=1]
br bool %tmp.405101, label %no_exit.8, label %UnifiedReturnBlock

label.6: ; preds = %entry
%tmp.4.i = tail call uint %fwrite( sbyte* getelementptr ([24 x sbyte]* %.str_28, int 0, int 0), uint 23,
uint 1, sbyte* null ) ; [#uses=0]
ret void

UnifiedReturnBlock: ; preds = %then.17, %no_exit.8
ret void
}

declare int %fprintf(%struct._IO_FILE*, sbyte*, ...)

declare uint %fwrite(sbyte*, uint, uint, sbyte*)

Compiled with 'llc -enable-x86-fastcc', this crashes llc.

This is due to the coalescer coalescing virtregs with both EAX and EDX, which makes them unavailable
to satisfy spills, causing the RA to run out of registers. We want to coalesce physregs when possible,
but we cannot pin them in the spiller: we have to be able to uncoalesce them.

This is almost certainly related to Bug 699.

-Chris

@asl
Collaborator

asl commented Jan 29, 2007

Now that the inreg patch has landed, this bug seems more important. For
example, it (currently) breaks Qt.

Consider a regparm(3) function compiled in PIC mode on
x86/Linux. In general, eax, ebx, ecx and edx are all in use at the entry of the
function. It seems that the register allocator can't handle this corner case:
llc -debug shows that it just runs out of registers and then goes into an infinite
loop. Compiling the same function in non-PIC mode (which frees ebx) lets
the register allocator handle this case correctly.

A cheap workaround is to lower regparm(3) to regparm(2). However, it seems many
libraries use stdcall + regparm(3) as a "fast" CC, so it would definitely be
better to support this. At the very least, infinite looping is not good :)

I can provide an additional .ll file that causes llc to loop.

@lattner
Collaborator Author

lattner commented Jan 29, 2007

Evan, can you think of a reasonably simple way to work around this problem in the short-term?

-Chris

@llvmbot
Collaborator

llvmbot commented Jan 31, 2007

I have to spend some time (which I don't have right now) to figure out a fix. Do
you suppose it's easy for the coalescer to detect that a liverange has been
coalesced to more than one physical register?

@lattner
Copy link
Collaborator Author

lattner commented Jan 31, 2007

I don't think that's the issue. Consider a two register machine with code like this:

vreg1 = r1
vreg2 = r2

vreg3 = some operation

use vreg1
use vreg2

Right now, the coalescer will coalesce vreg1 with r1 because the live ranges don't overlap. Then it
coalesces vreg2 with r2, because those live ranges don't overlap either.

Then the regalloc part starts, and tries to allocate vreg3. However, there are no regs to allocate vreg3
to, and badness ensues.

I think the right way to solve this is to detect badness and break long live ranges tied to physregs (like
vreg1/2 above). The problem with this is book-keeping: we'd need to remember where these things
came from to know how/where to break them.

-Chris
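The two-register scenario above can be sketched as a toy model. This is only an illustration under stated assumptions, not LLVM's actual data structures: the instruction indices, the `live` table, and the `allocate` and `overlaps` helpers are all hypothetical names invented for this example. Once vreg1 and vreg2 are pinned to r1 and r2 for their whole live ranges, no register is free across vreg3's range.

```python
# Toy model of the two-register machine; not LLVM's real allocator.
PHYS_REGS = ["r1", "r2"]

# Live ranges as half-open [start, end) intervals over hypothetical
# instruction indices.
live = {
    "vreg1": (0, 4),  # copied from r1, used after vreg3 is defined
    "vreg2": (1, 5),  # copied from r2, used after vreg3 is defined
    "vreg3": (2, 3),  # result of "some operation"
}

def overlaps(a, b):
    """True if two half-open intervals share at least one point."""
    return a[0] < b[1] and b[0] < a[1]

def allocate(vreg, pinned):
    """Return the first physreg whose pinned ranges do not overlap
    vreg's live range, or None if every register is occupied."""
    for reg in PHYS_REGS:
        if not any(overlaps(live[vreg], rng) for rng in pinned.get(reg, [])):
            return reg
    return None

# After coalescing, each physreg is tied up for a whole virtreg range.
pinned = {"r1": [live["vreg1"]], "r2": [live["vreg2"]]}
print(allocate("vreg3", pinned))  # None: the allocator has run out of registers

# If vreg2 could be uncoalesced (and spilled), r2 becomes available.
unpinned = {"r1": [live["vreg1"]], "r2": []}
print(allocate("vreg3", unpinned))  # r2
```

The second call shows why the coalesced ranges need to stay breakable: freeing either physreg is enough for vreg3 to be allocated, at the cost of a spill for the virtreg that was evicted.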

@lattner
Collaborator Author

lattner commented Jan 31, 2007

Oh yeah, in case it's not obvious, the desired code in this example is something like:

r1 = r1
r2 = r2
spill r2 -> ss#1

r2 = some operation

use r1
restore ss#1 -> r2
use r2

or something.
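As a rough check of the desired sequence above, the following sketch interprets it on a toy machine and records which value each `use` actually sees. The `run` function and the tuple encoding of instructions are assumptions made for this illustration; they do not correspond to any LLVM API.

```python
def run(program):
    """Interpret a tiny instruction list and return the values seen by uses.

    Registers start out holding the incoming values that the coalescer
    tied to them (vreg1 in r1, vreg2 in r2).
    """
    regs = {"r1": "vreg1", "r2": "vreg2"}
    slots = {}
    seen = []
    for op, *args in program:
        if op == "spill":       # spill reg -> stack slot
            reg, slot = args
            slots[slot] = regs[reg]
        elif op == "restore":   # restore stack slot -> reg
            slot, reg = args
            regs[reg] = slots[slot]
        elif op == "def":       # reg = some operation producing value
            reg, value = args
            regs[reg] = value
        elif op == "use":       # read a register
            (reg,) = args
            seen.append(regs[reg])
    return seen

# The desired code: spill r2 around the definition of vreg3.
desired = [
    ("spill", "r2", "ss#1"),
    ("def", "r2", "vreg3"),
    ("use", "r1"),
    ("restore", "ss#1", "r2"),
    ("use", "r2"),
]

# Without the spill and restore, vreg3 clobbers vreg2 in r2.
naive = [
    ("def", "r2", "vreg3"),
    ("use", "r1"),
    ("use", "r2"),
]

print(run(desired))  # ['vreg1', 'vreg2']
print(run(naive))    # ['vreg1', 'vreg3']
```

With the spill and restore in place, both uses read the values they were meant to; without them, the second use reads the result of the intervening operation instead of vreg2.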

@asl
Collaborator

asl commented Mar 20, 2007

Failed bytecode
This is sample bytecode from Qt, which causes llc to cycle. Compile it with llc
-relocation-model=pic. The "bad" function is
_Z9_decOctetPPcP10QByteArrayP9ErrorInfo.

@lattner
Collaborator Author

lattner commented May 17, 2007

Evan fixed this a while back. Qt doesn't require any regparm hacks to build with 2.0.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
This issue was closed.