Invalid alignment for SSE2 code #2021

Closed

asl opened this issue Sep 11, 2007 · 11 comments
Labels
backend:X86, bugzilla (Issues migrated from bugzilla)

Comments


asl commented Sep 11, 2007

Bugzilla Link 1649
Resolution FIXED
Resolved on Apr 23, 2008 13:45
Version trunk
OS Linux
Attachments Function extracted
CC @tlattner

Extended Description

Consider the attached testcase.

In bb79:

bb79: ; preds = %bb57
    %tmp81 = load double* %tmp68, align 16          ; <double> [#uses=1]
    %tmp82 = sub double -0.000000e+00, %tmp81       ; <double> [#uses=1]
    store double %tmp82, double* %tmp68, align 16
    %tmp84 = getelementptr [3 x double]* %Raw_Normal, i32 0, i32 1      ; <double*> [#uses=2]
    %tmp85 = load double* %tmp84, align 8           ; <double> [#uses=1]
    %tmp86 = sub double -0.000000e+00, %tmp85       ; <double> [#uses=1]
    store double %tmp86, double* %tmp84, align 8
    %tmp88 = getelementptr [3 x double]* %Raw_Normal, i32 0, i32 2      ; <double*> [#uses=2]
    %tmp89 = load double* %tmp88, align 8           ; <double> [#uses=1]
    %tmp90 = sub double -0.000000e+00, %tmp89       ; <double> [#uses=1]
    store double %tmp90, double* %tmp88, align 8
    br label %bb92

This is codegen'ed (with -disable-fp-elim) into:

    movsd   .LCPI1_0, %xmm0
    movapd  %xmm0, %xmm1
    xorpd   -104(%ebp), %xmm1
    movsd   %xmm1, -104(%ebp)
    movsd   -96(%ebp), %xmm1
    xorpd   %xmm0, %xmm1
    movsd   %xmm1, -96(%ebp)
    movsd   -88(%ebp), %xmm1
    xorpd   %xmm0, %xmm1
    movsd   %xmm1, -88(%ebp)

We definitely have a problem here: even if %ebp is 16-byte aligned, the accesses to the temporaries are not aligned at all.

This causes povray miscompilation at -O2.
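
(For context: the sub double -0.000000e+00, %x negation is lowered to an XOR with a sign-bit mask, so .LCPI1_0 presumably holds the constant below; this is my reconstruction, not part of the testcase:)

    .LCPI1_0:
        .quad   0x8000000000000000      ; -0.0: xorpd with this mask flips the sign bit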


llvmbot commented Sep 13, 2007

    xorpd   -104(%ebp), %xmm1
    movsd   %xmm1, -104(%ebp)
    movsd   -96(%ebp), %xmm1
    movsd   %xmm1, -96(%ebp)
    movsd   -88(%ebp), %xmm1
    movsd   %xmm1, -88(%ebp)

Only the xorpd -104(%ebp) access has to be 16-byte aligned; movsd requires only 8-byte alignment.

Furthermore, the prologue looks like this:

    pushl   %ebp
    movl    %esp, %ebp
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    subl    $172, %esp

Assume %esp was 16-byte aligned before the call. The call pushes 4 bytes and the first pushl %ebp decrements %esp by another 4 bytes, so %ebp = old %esp - 8 = new %esp + 184.

-104(%ebp) is then 16-byte aligned. This is correct assuming %esp was 16-byte aligned before the call. But is that true? The Linux stack alignment requirement is only 8 bytes.
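
Spelling out the offset arithmetic (my annotation, under the same assumption):

    ; assume old %esp = 16*k at the call site
    ; call              ->  %esp = old %esp - 4
    ; pushl %ebp        ->  %esp = old %esp - 8
    ; movl %esp, %ebp   ->  %ebp = old %esp - 8    (8 mod 16)
    ; %ebp - 104        =   old %esp - 112
    ;                   =   old %esp - 7*16        (0 mod 16, i.e. 16-byte aligned)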

Anton, can you tell me the value of %esp upon entry?


asl commented Sep 13, 2007

Hrm, it seems you're right.

%ebp is 0xbfe24bf8 for me :( Any ideas how this can be fixed? Can the result of #2008 be used to solve this issue?


llvmbot commented Sep 13, 2007

We need to dynamically align the stack pointer in the prologue. This means using the frame pointer to reference objects in the caller's frame and the stack pointer to reference local objects. If alloca() is used, does that mean keeping yet another pointer to reference local objects? I am not sure.
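
A minimal sketch of such a prologue (my illustration of the idea; the frame size and register usage are placeholders, not a committed implementation):

    pushl   %ebp
    movl    %esp, %ebp          ; %ebp addresses the caller's frame / incoming args
    andl    $-16, %esp          ; dynamically realign %esp to a 16-byte boundary
    subl    $176, %esp          ; frame size rounded up to a multiple of 16
    ...                         ; locals referenced off %esp, arguments off %ebp
    movl    %ebp, %esp          ; epilogue: undo the realignment
    popl    %ebp
    ret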

I probably won't be getting to this any time soon. Anton, would you like to take a crack at it? :-)


asl commented Sep 14, 2007

This seems to be exactly the task of #2008. So, once we have a sane stack realignment facility, both PRs will be fixed.

I'll try to dig into this (hopefully really soon).


asl commented Sep 15, 2007

*** Bug llvm/llvm-bugzilla-archive#1671 has been marked as a duplicate of this bug. ***


lattner commented Oct 4, 2007

The GCC folks are discussing stack realignment, some interesting links here:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16660


asl commented Oct 4, 2007


lattner commented Feb 10, 2008

I think this is fixed on mainline by generating movups instead of movaps when referencing the stack. Is this true, and if so, is this now a code quality bug?
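
(For reference, my illustration rather than anything from the thread: movups tolerates any alignment, so it trades the alignment fault for slower unaligned accesses on hardware of that era, which is why this would be a code quality rather than a correctness issue:)

    movaps  -104(%ebp), %xmm1   ; faults (#GP) unless the address is 16-byte aligned
    movups  -104(%ebp), %xmm1   ; safe at any alignment, but slower on those CPUs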


llvmbot commented Feb 11, 2008

I don't think this is fixed. I believe we can still generate loads from stack slots that are believed to be 16-byte aligned but are not. The problem is that isel lowering assumes the stack pointer is properly aligned upon entry into the function, but this is not true on some x86 targets. On those, e.g. Cygwin, the stack pointer must be dynamically aligned whenever 16-byte aligned stack objects are created.


asl commented Apr 23, 2008

This should be fixed by the series of patches r50152-r50173.

Stack realignment is currently enabled only in the 'easy' cases, when no dynamically-sized stack objects are present.
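
For reference, a sketch of why variable-sized objects are the hard case (my illustration, not code from the patches): once %esp has been realigned, a dynamic alloca moves %esp again, so neither %ebp (the caller's frame) nor %esp (which keeps moving) can address the fixed-size locals, and a third "base pointer" register would be needed:

    pushl   %ebp
    movl    %esp, %ebp          ; caller's frame / incoming arguments
    andl    $-16, %esp          ; realign
    subl    $176, %esp          ; fixed-size locals (placeholder size)
    movl    %esp, %esi          ; hypothetical base pointer anchoring the fixed locals
    ...                         ; (a real implementation would also save %esi first)
    subl    %eax, %esp          ; dynamic alloca moves %esp; %esi stays valid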


asl commented Nov 26, 2021

mentioned in issue llvm/llvm-bugzilla-archive#1671

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
This issue was closed.