scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598
Comments
Repro with:
Here's a reduced testcase: #include <string.h>
implementation ; Functions: define i32 @test(%struct.foo* %P) {
This patch contains the (disabled) code to do the SROA. Before this can be enabled, mem2reg needs to be http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070305/045540.html -Chris
Second half committed here: Third 'half' will come later.
For my notes: The main part of this is implemented, but the testcase in this bug is not yet implemented. The reason
Another testcase: #include <string.h> struct foo { int A, B; }; struct bar{ struct foo x; long long y; double D; }; int test1(struct foo *P) { int test2() { int test3() { B.x.A = 1; -Chris
Here is the final piece: Testcases here: -Chris
mentioned in issue #824
Teach SROA to handle allocas with more than one dbg.declare.
Bug 1598 was due to a call's implicit return value not being captured. Fixes llvm#1598 Co-authored-by: Lukas Diekmann <lukas.diekmann@gmail.com>
Extended Description
Consider:
#include <tr1/functional>
#include <algorithm>
void assign( long* variable, long v) {
std::transform( variable, variable + 1, variable,
std::tr1::bind( std::plus< long >(), 0L, v ) );
}
This compiles to a single store on x86, but a whole ton of code on x86-64. This is because the
temporary structs are larger on x86-64, so EmitAggregateCopy in llvm-gcc emits them as a memcpy
instead of scalar transfers.
The problem is that this later blocks scalarrepl from promoting the structs, causing much worse
codegen:
__Z6assignRll: # x86-32
movl 8(%esp), %eax
movl 4(%esp), %ecx
movl %eax, (%ecx)
ret
__Z6assignRll: # x86-64
subq $88, %rsp
movb $0, 64(%rsp)
movq $0, 72(%rsp)
movq %rsi, 80(%rsp)
movq %rsi, 48(%rsp)
movq 72(%rsp), %rax
movq %rax, 40(%rsp)
movq 64(%rsp), %rax
movq %rax, 32(%rsp)
movq 40(%rsp), %rax
movq %rax, 8(%rsp)
movq 48(%rsp), %rax
movq %rax, 16(%rsp)
movq 32(%rsp), %rax
movq %rax, (%rsp)
movq 16(%rsp), %rax
addq 8(%rsp), %rax
movq %rax, (%rdi)
addq $88, %rsp
ret
-Chris
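For illustration only, here is a minimal sketch of the two forms of aggregate copy discussed above. The names `Pair`, `sum_via_memcpy`, and `sum_via_fields` are hypothetical and do not come from the bug report or from llvm-gcc. The first function copies a local struct with a single `memcpy`, roughly the shape a front end might emit for a larger aggregate. The second performs the same copy as per-field scalar transfers. Both compute the same result, so a scalar replacement pass that can look through the `memcpy` could in principle treat them identically.

```cpp
#include <cstring>

// Hypothetical two-field aggregate standing in for the temporary structs
// described in the report.
struct Pair {
    long a;
    long b;
};

// Aggregate copy expressed as one memcpy of the whole struct.
long sum_via_memcpy(long x, long y) {
    Pair src = {x, y};
    Pair dst;
    std::memcpy(&dst, &src, sizeof(Pair));
    return dst.a + dst.b;
}

// The same copy expressed as one scalar transfer per field.
long sum_via_fields(long x, long y) {
    Pair src = {x, y};
    Pair dst;
    dst.a = src.a;
    dst.b = src.b;
    return dst.a + dst.b;
}
```

Whether a given compiler version reduces both functions to the same code depends on its optimizer; the sketch only shows that the two source forms are semantically equivalent.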