Expressions involving undef get collapsed into undef properly, except for PHI nodes. The following code should compile into "ret int undef": since x is 4, the branch is never taken, so y is uninitialized on the only path to the return. Instead, LLVM produces "ret int 0":

int f() {
  int x = 4;
  int y;
  if (x == 3)
    y = 0;
  return y;
}
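(For reference, the fully optimized function would then be just the following sketch, whereas the current optimizers keep the 0:)

int %f() {
        ret int undef
}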
Verified. This is a missed optimization.
I've looked into this, and it doesn't seem possible to fix in a straightforward way, and it doesn't seem very critical. The problem is that we have this code going into the -mem2reg pass:

int %f() {
entry:
        %result = alloca int                    ; <int*> [#uses=2]
        %x = alloca int                         ; <int*> [#uses=2]
        %y = alloca int                         ; <int*> [#uses=2]
        store int 4, int* %x
        %tmp.0 = load int* %x                   ; <int> [#uses=1]
        %tmp.1 = seteq int %tmp.0, 3            ; <bool> [#uses=2]
        %tmp.2 = cast bool %tmp.1 to int        ; <int> [#uses=0]
        br bool %tmp.1, label %then, label %endif

then:           ; preds = %entry
        store int 0, int* %y
        br label %endif

endif:          ; preds = %then, %entry
        %tmp.3 = load int* %y                   ; <int> [#uses=1]
        store int %tmp.3, int* %result
        %tmp.4 = load int* %result              ; <int> [#uses=1]
        ret int %tmp.4
}

Because we haven't built SSA form yet, the dead code elimination that runs before this point can't see that the block is dead. When building SSA in mem2reg, it notices it's about to create a phi(undef,0), and simplifies it to 0. This is an important pruning heuristic, but it causes mem2reg to miss the optimization here. I don't see how this enhancement can be done without disabling the heuristic, which would hurt normal code for the sake of this minor optimization.

-Chris
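(For illustration: if mem2reg did emit the phi instead of folding it, the promoted function would look roughly like this; the value and block names here are illustrative, not what the pass actually prints.)

int %f() {
entry:
        %tmp.1 = seteq int 4, 3
        br bool %tmp.1, label %then, label %endif

then:           ; preds = %entry
        br label %endif

endif:          ; preds = %then, %entry
        %y.0 = phi int [ undef, %entry ], [ 0, %then ]  ; mem2reg's heuristic folds this to 0
        ret int %y.0
}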
Why not allow mem2reg to produce the phi(undef,x), and then add another pass after DCE which collapses the now-degenerate phi (just phi(undef) once the dead edge is gone) to undef? It might make the passes in between mem2reg and DCE slower, since the IR would be larger than before, but there would be no negative impact on the resulting code, would there?
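(For illustration: under that proposal, once the dead %then block is removed, the phi is left with only its undef entry, and the extra pass could fold it; roughly, continuing the sketch above:)

int %f() {
entry:
        br label %endif

endif:          ; preds = %entry
        %y.0 = phi int [ undef, %entry ]        ; the proposed pass folds this to undef
        ret int %y.0                            ; i.e., ret int undef
}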
I moved this minor optzn into the README.txt file: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20080225/059042.html