-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tail recursion elimination doesn't move loads above nounwind readonly call #4695
Comments
If the call unwinds then the load is never executed and there is no |
Fixed patch It currently does not do this at all if the instruction is a load, but it should be safe if the function does not unwind and does not write to memory. ... and I just noticed I had the check reversed. Attaching corrected patch. For reference, this is the case I was talking about:target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128" define i32 @_Z3fooPiii(i32* %p, i32 %i, i32 %max) readonly { bb1: ; preds = %entry bb2: ; preds = %bb1 bb3: ; preds = %bb1 return: ; preds = %entry
|
Isn't it safe to move the load before the call if the load can be |
Sure. But I don't see an easy way to determine that. My proposed change would certainly not be the most precise fix, but it's an improvement over the current situation :). (You could also move the load if it is non-trapping and the call writes to memory, but not that memory. However, that would require adding an alias analysis dependency to the pass, and I wasn't sure if that would be accepted) |
Instcombine has isSafeToLoadUnconditionally. You could factor |
I doubt another pass would see the benefit of doing so. The only reason it's profitable is because it allows the elimination of tail recursion. |
A more comprehensive patch
Passes 'make check', including the new test cases included in the patch. |
The patch to move and enhance isSafeToLoadUnconditionally looks great to me, please commit it as a separate change. |
Once that is done, please send the proposed patch to llvm-commits where I'll see it. It would be nice to simplify the conditional logic somehow if possible. |
Patch with just isSafeToLoadUnconditionally() changes.
See new attachment. I don't have commit access, so someone else will have to commit it. |
Applied, I also added a testcase. Please make sure any optimizer enhancement includes a small testcase (e.g. r73508). |
Sorry, I only had testcases for the tailcallelim change. I didn't really study what instcombine was trying to do there, I just stole a handy function and modified it to make it work for this case :).
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090615/078834.html |
Patch applied here, thanks! |
Revert "Skip tracking for async entry values when processing incoming BB locs. "
Extended Description
opt -tailcallelim refuses to eliminate a tail call if it's followed by load, even if the call is both nounwind and readonly.
An example where it would be safe to do this:
define fastcc i32 @_D4test3sumFPiiiZi(i32* %a_arg, i32 %a_len_arg, i32 %start_arg) nounwind readonly {
entry:
%tmp2 = icmp slt i32 %start_arg, %a_len_arg ; [#uses=1]
br i1 %tmp2, label %else, label %if
if: ; preds = %entry
ret i32 0
else: ; preds = %entry
%tmp4 = sext i32 %start_arg to i64 ; [#uses=1]
%tmp6 = getelementptr i32* %a_arg, i64 %tmp4 ; <i32*> [#uses=1]
%tmp10 = add i32 %start_arg, 1 ; [#uses=1]
%tmp11 = tail call fastcc i32 @_D4test3sumFPiiiZi(i32* %a_arg, i32 %a_len_arg, i32 %tmp10) ; [#uses=1]
%tmp12 = load i32* %tmp6 ; [#uses=1]
%tmp13 = add i32 %tmp12, %tmp11 ; [#uses=1]
ret i32 %tmp13
}
I'm attaching a small patch that remedies the situation.
Note that I used CI->mayHaveSideEffects() instead of CI->onlyReadsMemory() to correctly handle the case where the call may throw and the load may introduce undefined behavior[1]. This way, the transformation does not take place in that case.
[1]: For example, if the pointer being loaded from may be null but the call will throw in this case it's not safe to move the load before the call since that would produce undefined behavior instead of an exception.
The text was updated successfully, but these errors were encountered: