scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598

lattner · 2007-02-25T22:35:25Z


Bugzilla Link	1226
Resolution	FIXED
Resolved on	Feb 22, 2010 12:56
Version	1.0
OS	All
Blocks	#824

Extended Description

Consider:

#include <tr1/functional>
#include
void assign( long* variable, long v) {
std::transform( variable, variable + 1, variable,
std::tr1::bind( std::plus< long >(), 0L, v ) );
}

This compiles to a single store on x86, but a whole ton of code on x86-64. This is because the
temporary structs are larger on x86-64, so EmitAggregateCopy in llvm-gcc emits them as a memcpy
instead of scalar transfers.

The problem is that this later blocks scalarrepl from promoting the structs, causing much worse
codegen:

__Z6assignRll: # x86-32
movl 8(%esp), %eax
movl 4(%esp), %ecx
movl %eax, (%ecx)
ret

__Z6assignRll: # x86-64
subq $88, %rsp
movb $0, 64(%rsp)
movq $0, 72(%rsp)
movq %rsi, 80(%rsp)
movq %rsi, 48(%rsp)
movq 72(%rsp), %rax
movq %rax, 40(%rsp)
movq 64(%rsp), %rax
movq %rax, 32(%rsp)
movq 40(%rsp), %rax
movq %rax, 8(%rsp)
movq 48(%rsp), %rax
movq %rax, 16(%rsp)
movq 32(%rsp), %rax
movq %rax, (%rsp)
movq 16(%rsp), %rax
addq 8(%rsp), %rax
movq %rax, (%rdi)
addq $88, %rsp
ret

-Chris

lattner · 2007-02-25T22:36:28Z

Repro with:
$ llvm-g++ t.cc -O3 -S -o - -fno-exceptions -fomit-frame-pointer -m64

lattner · 2007-03-05T02:25:26Z

Here's a reduced testcase:

#include <string.h>
struct foo { int A, B; };
int test(struct foo *P) {
struct foo L;
memcpy(&L, P, sizeof(struct foo));
return L.A;
}

    %struct.foo = type { i32, i32 }

implementation ; Functions:

define i32 @test(%struct.foo* %P) {
entry:
%L = alloca %struct.foo, align 8 ; <%struct.foo*> [#uses=2]
%L2 = bitcast %struct.foo* %L to i8* ; <i8*> [#uses=1]
%tmp13 = bitcast %struct.foo* %P to i8* ; <i8*> [#uses=1]
call void @llvm.memcpy.i32( i8* %L2, i8* %tmp13, i32 8, i32 4 )
%tmp4 = getelementptr %struct.foo* %L, i32 0, i32 0 ; <i32*> [#uses=1]
%tmp5 = load i32* %tmp4 ; [#uses=1]
ret i32 %tmp5
}

lattner · 2007-03-05T09:55:29Z

This patch contains the (disabled) code to do the SROA. Before this can be enabled, mem2reg needs to be
able to promote scalars targetted by memcpy/memset etc.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070305/045540.html

-Chris

lattner · 2007-03-08T08:56:15Z

Second half committed here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070305/045718.html

Third 'half' will come later.

lattner · 2007-03-08T09:09:45Z

For my notes:

The main part of this is implemented, but the testcase in this bug is not yet implemented. The reason
for this is that the lowered structure starts out with an array of bytes, so we get a memcpy ("gep x, 0,
0") instead of memcpy(bitcast). Two issues:

Make xform work with GEP.
Figure out why we're getting the array of bytes in this trivial case.

Another testcase:

#include <string.h>

struct foo { int A, B; };

struct bar{ struct foo x; long long y; double D; };

int test1(struct foo *P) {
struct foo L;
memcpy(&L, P, sizeof(struct foo));
return L.A;
}

int test2() {
struct foo L[4];
memset(L, 0, sizeof(struct foo)*4);
return L[0].A;
}

int test3() {
struct bar B;
memset(&B, 1, sizeof(B));

B.x.A = 1;
B.D = 10.0;
return B.x.B;
}

-Chris

lattner · 2007-03-19T02:19:56Z

Here is the final piece:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070312/045911.html

Testcases here:
Transforms/ScalarRepl/memset-aggregate.ll
Transforms/ScalarRepl/memset-aggregate-byte-leader.ll

-Chris

llvmbot · 2021-11-27T03:00:02Z

mentioned in issue #824

Teach SROA to handle allocas with more than one dbg.declare.

Bug 1598 was due to a call's implicit return value not being captured. Fixes llvm#1598 Co-authored-by: Lukas Diekmann <lukas.diekmann@gmail.com>

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021

vext01 mentioned this issue Feb 11, 2025

Fix defs ykjit/ykllvm#226

Merged

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598

scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598

lattner commented Feb 25, 2007

lattner commented Feb 25, 2007

lattner commented Mar 5, 2007

lattner commented Mar 5, 2007

lattner commented Mar 8, 2007

lattner commented Mar 8, 2007

lattner commented Mar 19, 2007

llvmbot commented Nov 27, 2021

scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598

scalarrepl should be able to scalarrepl aggregates with memcpy uses #1598

Comments

lattner commented Feb 25, 2007

Extended Description

lattner commented Feb 25, 2007

lattner commented Mar 5, 2007

lattner commented Mar 5, 2007

lattner commented Mar 8, 2007

lattner commented Mar 8, 2007

lattner commented Mar 19, 2007

llvmbot commented Nov 27, 2021