LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 4323 - Tail recursion elimination doesn't move loads above nounwind readonly call
Summary: Tail recursion elimination doesn't move loads above nounwind readonly call
Status: RESOLVED FIXED
Alias: None
Product: new-bugs
Classification: Unclassified
Component: new bugs (show other bugs)
Version: unspecified
Hardware: All All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-06-04 08:48 PDT by Frits van Bommel
Modified: 2009-06-18 23:23 PDT (History)
4 users (show)

See Also:
Fixed By Commit(s):


Attachments
Patch to check CI->mayHaveSideEffects() when looking at a load. (841 bytes, patch)
2009-06-04 08:48 PDT, Frits van Bommel
Details
Fixed patch (840 bytes, patch)
2009-06-04 09:31 PDT, Frits van Bommel
Details
A more comprehensive patch (12.94 KB, patch)
2009-06-05 10:50 PDT, Frits van Bommel
Details
Patch with just isSafeToLoadUnconditionally() changes. (5.26 KB, patch)
2009-06-16 09:47 PDT, Frits van Bommel
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Frits van Bommel 2009-06-04 08:48:24 PDT
Created attachment 3061 [details]
Patch to check CI->mayHaveSideEffects() when looking at a load.

opt -tailcallelim refuses to eliminate a tail call if it's followed by load, even if the call is both nounwind and readonly.

An example where it would be safe to do this:
----
define fastcc i32 @_D4test3sumFPiiiZi(i32* %a_arg, i32 %a_len_arg, i32 %start_arg) nounwind readonly {
entry:
	%tmp2 = icmp slt i32 %start_arg, %a_len_arg		; <i1> [#uses=1]
	br i1 %tmp2, label %else, label %if

if:		; preds = %entry
	ret i32 0

else:		; preds = %entry
	%tmp4 = sext i32 %start_arg to i64		; <i64> [#uses=1]
	%tmp6 = getelementptr i32* %a_arg, i64 %tmp4		; <i32*> [#uses=1]
	%tmp10 = add i32 %start_arg, 1		; <i32> [#uses=1]
	%tmp11 = tail call fastcc i32 @_D4test3sumFPiiiZi(i32* %a_arg, i32 %a_len_arg, i32 %tmp10)		; <i32> [#uses=1]
	%tmp12 = load i32* %tmp6		; <i32> [#uses=1]
	%tmp13 = add i32 %tmp12, %tmp11		; <i32> [#uses=1]
	ret i32 %tmp13
}
----

I'm attaching a small patch that remedies the situation.
Note that I used CI->mayHaveSideEffects() instead of CI->onlyReadsMemory() to correctly handle the case where the call may throw and the load may introduce undefined behavior[1]. This way, the transformation does not take place in that case.


[1]: For example, if the pointer being loaded from may be null but the call will throw in this case it's not safe to move the load before the call since that would produce undefined behavior instead of an exception.
Comment 1 Duncan Sands 2009-06-04 08:59:06 PDT
If the call unwinds then the load is never executed and there is no
undefined behaviour: unwinding continues in the caller.
Comment 2 Frits van Bommel 2009-06-04 09:31:39 PDT
Created attachment 3062 [details]
Fixed patch

Indeed, but the check in the pass is if it's safe to /move/ the instruction to /before/ the call, so that it can eliminate a tail recursion.

It currently does not do this at all if the instruction is a load, but it should be safe if the function does not unwind and does not write to memory.

... and I just noticed I had the check reversed. Attaching corrected patch.



For reference, this is the case I was talking about:
----
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128"
target triple = "x86_64-unknown-linux-gnu"

define i32 @_Z3fooPiii(i32* %p, i32 %i, i32 %max) readonly {
entry:
	%tmp4 = icmp slt i32 %i, %max		; <i1> [#uses=1]
	br i1 %tmp4, label %return, label %bb1

bb1:		; preds = %entry
	%tmp6 = icmp eq i32* %p, null		; <i1> [#uses=1]
	br i1 %tmp6, label %bb2, label %bb3

bb2:		; preds = %bb1
	unwind

bb3:		; preds = %bb1
	%tmp13 = sext i32 %i to i64		; <i64> [#uses=1]
	%tmp17 = add i32 %i, 1		; <i32> [#uses=1]
	%tmp20 = call i32 @_Z3fooPiii(i32* %p, i32 %tmp17, i32 %max)		; <i32> [#uses=1]
	%tmp15 = load i32* %p, align 1		; <i32> [#uses=1]
	%tmp21 = add i32 %tmp15, %tmp20		; <i32> [#uses=1]
	ret i32 %tmp21

return:		; preds = %entry
	ret i32 0
}
----
In this case, it's *not* safe to perform the transformation, and so the fixed patch does not.
Comment 3 Duncan Sands 2009-06-04 09:55:39 PDT
Isn't it safe to move the load before the call if the load can be
proved not to trap?
Comment 4 Frits van Bommel 2009-06-04 10:30:22 PDT
Sure. But I don't see an easy way to determine that. My proposed change would certainly not be the most precise fix, but it's an improvement over the current situation :).

(You could also move the load if it is non-trapping and the call writes to memory, but not *that* memory. However, that would require adding an alias analysis dependency to the pass, and I wasn't sure if that would be accepted)
Comment 5 Duncan Sands 2009-06-04 14:36:37 PDT
Instcombine has isSafeToLoadUnconditionally.  You could factor
that out and use it here.  On the other hand, maybe if the
load could be safely moved it already would have been by some
pass like instcombine?
Comment 6 Frits van Bommel 2009-06-04 19:49:34 PDT
(In reply to comment #5)
> On the other hand, maybe if the
> load could be safely moved it already would have been by some
> pass like instcombine?

I doubt another pass would see the benefit of doing so. The only reason it's profitable is because it allows the elimination of tail recursion.
Comment 7 Frits van Bommel 2009-06-05 10:50:05 PDT
Created attachment 3068 [details]
A more comprehensive patch

This patch
* moves instcombine's isSafeToLoadUnconditionally to Transforms/Utils/Local.cpp (the best place for it that I could find, but I'm open to suggestions).
* tweaks it a bit to ignore readonly calls when scanning backwards, since those aren't allowed to call free (right?).
* uses it in tailcallelim to allow more loads to be moved above the call.

Passes 'make check', including the new test cases included in the patch.
Comment 8 Chris Lattner 2009-06-16 00:31:17 PDT
The patch to move and enhance isSafeToLoadUnconditionally looks great to me, please commit it as a separate change.
Comment 9 Chris Lattner 2009-06-16 00:36:20 PDT
Once that is done, please send the proposed patch to llvm-commits where I'll see it.  It would be nice to simplify the conditional logic somehow if possible.
Comment 10 Frits van Bommel 2009-06-16 09:47:37 PDT
Created attachment 3098 [details]
Patch with just isSafeToLoadUnconditionally() changes.

(In reply to comment #8)
> The patch to move and enhance isSafeToLoadUnconditionally looks great to me,
> please commit it as a separate change.

See new attachment. I don't have commit access, so someone else will have to commit it.
Comment 11 Chris Lattner 2009-06-16 12:23:51 PDT
Applied, I also added a testcase.  Please make sure any optimizer enhancement includes a small testcase (e.g. r73508).
Comment 12 Frits van Bommel 2009-06-16 16:38:38 PDT
(In reply to comment #11)
> Applied, I also added a testcase.  Please make sure any optimizer enhancement
> includes a small testcase (e.g. r73508).

Sorry, I only had testcases for the tailcallelim change. I didn't really study what instcombine was trying to do there, I just stole a handy function and modified it to make it work for this case :).
(And I didn't think about this when I split up the patch)


(In reply to comment #9)
> Once that is done, please send the proposed patch to llvm-commits where I'll
> see it.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090615/078834.html
Comment 13 Chris Lattner 2009-06-18 23:23:34 PDT
Patch applied here, thanks!
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090615/079051.html