redundant load not removed #10664
Comments
We can handle a slightly simpler case: define void @test2(%zed* nocapture %bar) { What makes us handle it is the logic:
In the original testcase it fails because O2 is %tmp2 = load %zed** %bar, align 8 and %bar is the argument. Can BasicAliasAnalysis use memdep to find that no store could have set the result of that load to a local (alloca or noalias call) value?
Just some notes:
*) For allocas (but not noalias calls), we only need to consider the entry bb, provided we don't care about dynamic allocas (which we probably don't).
*) In this case at least, the check can be as simple as: the load comes before the first instruction in the entry bb that writes to memory.
I am currently reducing the original C++ testcase too, to see if these rules still apply and whether there is anything clang could do to make LLVM's task easier.
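The entry-block rule in the notes above can be illustrated with a small standalone model. This is only a sketch: `Inst`, `kEntry`, and `precedesFirstWrite` are hypothetical names invented for illustration, not LLVM classes or APIs, and the `mayWriteMemory` flag is assumed to be known for every instruction.

```cpp
#include <cstddef>

// Toy stand-in for an instruction; this is NOT LLVM's Instruction class.
struct Inst {
    const char* name;
    bool mayWriteMemory;  // assumption: known for each instruction
};

// Returns true if `index` names an instruction in the entry block and no
// instruction before it in that block may write to memory.
template <std::size_t N>
constexpr bool precedesFirstWrite(const Inst (&entryBlock)[N], std::size_t index) {
    for (std::size_t i = 0; i < index && i < N; ++i) {
        if (entryBlock[i].mayWriteMemory) return false;
    }
    return index < N;
}

// The entry block of @test from the report, reduced to the one property
// the check looks at. Calls are conservatively treated as writers.
constexpr Inst kEntry[] = {
    {"%tmp1 = alloca", false},
    {"%tmp2 = load", false},
    {"%tmp3 = getelementptr", false},
    {"%tmp4 = load", false},
    {"call @llvm.memcpy", true},
    {"%tmp5 = load", false},
    {"call @foobar", true},
    {"call @foobar", true},
};

// %tmp2 is loaded before anything in the function has written to memory,
// so the address of %tmp1 cannot have been stored anywhere it could read.
static_assert(precedesFirstWrite(kEntry, 1), "%tmp2 precedes the first write");
// %tmp5 comes after the memcpy, so the same argument does not apply to it.
static_assert(!precedesFirstWrite(kEntry, 5), "%tmp5 follows the memcpy");
```

The model only covers static allocas in the entry block, matching the first note; noalias calls would need more than a single-block scan.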
The C++ testcase: class FrameRegs {
The code clang generates looks pretty reasonable, the two loads are just the two stack->seg_. My current idea is to add pointsToNonLocalMemory to the alias analysis interface and add a really simple implementation to the basic alias analysis. Sounds reasonable?
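To make the proposal concrete, here is a rough standalone sketch of how such a query might feed an alias decision. Everything in it is an assumption for illustration: the `Origin` classification, `isLocalObject`, and this `alias` function are hypothetical and do not reflect LLVM's actual AliasAnalysis interface or the contents of the patches attached to this issue.

```cpp
// Hypothetical classification of where a pointer's base value comes from.
enum class Origin {
    Alloca,                // static alloca in the current function
    Argument,              // incoming function argument
    LoadBeforeFirstWrite,  // loaded before any write in the entry block
    Unknown                // anything else
};

enum class AliasResult { NoAlias, MayAlias };

// Sketch of the proposed query: true when the pointer cannot refer to
// memory allocated by the current function.
constexpr bool pointsToNonLocalMemory(Origin o) {
    return o == Origin::Argument || o == Origin::LoadBeforeFirstWrite;
}

// Only allocas are modeled as local objects; noalias calls are left out
// of this sketch on purpose.
constexpr bool isLocalObject(Origin o) { return o == Origin::Alloca; }

// A local object and a pointer to non-local memory cannot overlap.
// Every other combination stays conservative in this toy model.
constexpr AliasResult alias(Origin a, Origin b) {
    return ((isLocalObject(a) && pointsToNonLocalMemory(b)) ||
            (isLocalObject(b) && pointsToNonLocalMemory(a)))
               ? AliasResult::NoAlias
               : AliasResult::MayAlias;
}

// In @test, the memcpy destination %tmp1 is an alloca, and %tmp3 is based
// on %tmp2, which is loaded before any write in the entry block.
static_assert(alias(Origin::Alloca, Origin::LoadBeforeFirstWrite) ==
                  AliasResult::NoAlias,
              "memcpy into %tmp1 cannot clobber the memory behind %tmp3");
static_assert(alias(Origin::Alloca, Origin::Unknown) == AliasResult::MayAlias,
              "unknown pointers stay conservative");
```

With an answer like this available, the memcpy into %tmp1 would no longer be treated as a possible clobber of the memory read by %tmp4, which is what would let the second load be recognized as redundant.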
proposed patch
new patch
fixed patch
rdar://13017143
Still an issue: https://godbolt.org/z/4v84v6 I think we should be able to weed out such loads at the end of MemorySSA-backed DSE. |
Extended Description
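In the following example, the load %tmp5 is redundant and could be removed: it reads the same address as %tmp4, and the intervening memcpy writes only to the alloca %tmp1.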
%zed = type { i8* }
define void @test(%zed** %bar) {
%tmp1 = alloca i8, align 8
%tmp2 = load %zed** %bar, align 8
%tmp3 = getelementptr inbounds %zed* %tmp2, i64 0, i32 0
%tmp4 = load i8** %tmp3, align 8
call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp4, i64 24, i32 8, i1 false)
%tmp5 = load i8** %tmp3, align 8
call void @foobar(i8* %tmp1)
call void @foobar(i8* %tmp5)
ret void
}
declare void @foobar(i8*)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) nounwind