LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 31362 - ms_abi is implemented incorrectly for larger values (>=16 bytes)
Summary: ms_abi is implemented incorrectly for larger values (>=16 bytes)
Status: RESOLVED FIXED
Alias: None
Product: clang
Classification: Unclassified
Component: -New Bugs (show other bugs)
Version: 3.9
Hardware: PC Linux
: P normal
Assignee: Unassigned Clang Bugs
URL:
Keywords:
: 29143 33834 35814 (view as bug list)
Depends on:
Blocks: release-6.0
  Show dependency tree
 
Reported: 2016-12-13 12:58 PST by simonas+llvm.org
Modified: 2018-03-30 09:05 PDT (History)
9 users (show)

See Also:
Fixed By Commit(s): rL324594


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description simonas+llvm.org 2016-12-13 12:58:27 PST
```c
#include <inttypes.h>
struct i128 { uint64_t a; uint64_t b; };

__attribute__((ms_abi, noinline)) struct i128 passthrough_a_s(struct i128 a) {
    return a;
}

__attribute__((ms_abi, noinline)) __int128 passthrough_a(__int128 a) {
    return a;
}
```

This code compiles to assembly that looks like this:


```asm
passthrough_a_s:
        mov     rax, rcx
        ret

passthrough_a:
        mov     rax, rcx
        ret
```

As per these two documents:

[msdn1]: https://msdn.microsoft.com/en-us/library/zthk2dkh.aspx
[msdn2]: https://msdn.microsoft.com/en-us/library/7572ztz4.aspx

Both of these are wrong and the generated assembly ought to look like this instead:


```asm
passthrough_a_s:
        mov     rax, rcx
        mov     r9, QWORD PTR [rdx]
        mov     r10, QWORD PTR [rdx+8]
        mov     QWORD PTR [rcx], r9
        mov     QWORD PTR [rcx+8], r10
        ret

passthrough_a:
        mov     rax, rcx
        mov     r9, QWORD PTR [rdx]
        mov     r10, QWORD PTR [rdx+8]
        mov     QWORD PTR [rcx], r9
        mov     QWORD PTR [rcx+8], r10
        ret
```

The relevant excerpts:

> A scalar return value that can fit into 64 bits is returned through RAX—this includes __m64 types. Non-scalar types including floats, doubles, and vector types such as __m128, __m128i, __m128d are returned in XMM0. [snip] To be returned by value in RAX, user-defined types must have a length of 1, 2, 4, 8, 16, 32, or 64 bits. [snip] Otherwise, the caller assumes the responsibility of allocating memory and passing a pointer for the return value as the first argument.

> __m128 types, arrays and strings are never passed by immediate value but rather a pointer is passed to memory allocated by the caller. Structs/unions of size 8, 16, 32, or 64 bits and __m64 are passed as if they were integers of the same size. Structs/unions other than these sizes are passed as a pointer to memory allocated by the caller. For these aggregate types passed as a pointer (including __m128), the caller-allocated temporary memory will be 16-byte aligned.

Also from https://msdn.microsoft.com/en-us/library/ms235286.aspx:

> There is no attempt to spread a single argument across multiple registers.
Comment 1 Reid Kleckner 2018-01-17 14:34:28 PST
*** Bug 33834 has been marked as a duplicate of this bug. ***
Comment 2 Reid Kleckner 2018-01-17 14:34:43 PST
*** Bug 29143 has been marked as a duplicate of this bug. ***
Comment 3 Reid Kleckner 2018-01-17 14:36:46 PST
*** Bug 35814 has been marked as a duplicate of this bug. ***
Comment 4 Hans Wennborg 2018-01-18 02:57:08 PST
Issue 35814 was tentatively marked as a release blocker, so let's mark this as such too. I don't know if anyone will have time to look into it though.
Comment 5 Dimitry Andric 2018-01-18 03:26:21 PST
(In reply to Hans Wennborg from comment #4)
> Issue 35814 was tentatively marked as a release blocker, so let's mark this
> as such too. I don't know if anyone will have time to look into it though.

Bug 35814 was specifically about clang now asserting after https://reviews.llvm.org/rL316416.  It would be nice to fix at least that, somehow?  Not sure if reverting r316416 is feasible, though.
Comment 6 Mateusz Belicki 2018-01-26 07:21:53 PST
I plan to look into this issue.

Currently my understanding is that win64cc was never fully implemented (byval returns and arguments are not handled as specified in the convention). The solution seems to be to modify X86ISelLowering to correctly spill/fill byval arguments to addresses passed in RCX, RDX, R8 and R9 instead of stack if this CC is used.

With byvals loaded and stored correctly the original problem form Bug 35814 should disappear as the X86CallFrameOptimization will not be removing the stores and loads anymore.
Comment 7 Mateusz Belicki 2018-02-07 05:03:27 PST
After investigation I reached conclusion that issue is caused by Clang.

When clang is lowering function into IR on platform other than Windows standard x86_64 ABI rules are used with disregard to what is specified using __attribute__. This is caused by the fact that only compilation target is checked when when choosing ABI, calling convention of lowered function (which may be overwritten by attribute) is not checked. This caused clang to split 128-bit byval into 64-bit arguments when emiting IR. Even though the information about CC is available to the LLVM code generator there is no indication that the arguments were originally forming single byval argument, so code gen is not able to produce correct binary in this case.

I've prepared a patch that is fixing this issue: https://reviews.llvm.org/D43016
Comment 8 Dimitry Andric 2018-02-08 08:14:29 PST
There's now a fix landed in https://reviews.llvm.org/rL324594, it at least fixes the assertion from bug 35814, but can the original submitter of this bug also please independently verify that it fixes their issue?

We should merge this as soon as it has baked a little in trunk, Hans, do you agree?
Comment 9 Hans Wennborg 2018-02-09 01:01:50 PST
(In reply to Dimitry Andric from comment #8)
> There's now a fix landed in https://reviews.llvm.org/rL324594, it at least
> fixes the assertion from bug 35814, but can the original submitter of this
> bug also please independently verify that it fixes their issue?
> 
> We should merge this as soon as it has baked a little in trunk, Hans, do you
> agree?

Sounds good to me. Merged in r324718.