New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perform substitution when equality is known #1179
Comments
assigned to @nlewycky |
I've implemented my suggested change and it doesn't work. My change updates the Modifying the source to read: causes the example to optimize away as it should. Since that's correct in |
It isn't safe to P->replaceAllUsesWith(Q) or vice versa because replaceAllUsesWith will replace ALL uses Implementing this requires a value range propagation pass. LLVM has two things related to this: lib/Transforms/Scalar/CondPropagate.cpp, is a very simple conditional value propagor, and is stable. Unfortunately there isn't currently an easy place to add this to LLVM. -Chris |
Adds new Propagate Equality pass This finds all branch instructions (only, no switch yet) and finds the set of
This is written to handle multiple false roots for when switch support is Anyways, it handles the test in the bug report correctly, and still passes the |
Duh, that algorithm was incorrect and could replace too many uses, in an ugly |
New propagate equality pass. |
Nifty. Have you evaluated how many times this triggers on various programs? I'd suggest setting up a test to see how often it applies, details here: As a side note, this loop: i->replaceUsesOfWith(Old, New); Also, please s/LLVM Research Group/you/ in the file header. :) -Chris |
New new propagate equality pass. Stats: |
I realize that this probably isn't ready for review (recent bugs etc), but here is some feedback in the The pass as you have it now can be checked into CVS whenever you're happy with it and it is relatively Before this gets turned on by default, I'd like to see the pass get a little bit more general. In particular,
Things that this approach can simplify: X = load P <- from this point on, we know P is non-null, enter P != 0 into the table. if (A != 4) { <- record A != 5 in the table float A, B; etc. There are a large number of simple transformations like this that we can build up with this Are you interested in working on this? If so, the best approach is for you to get the initial version set -Chris |
Other more specific comments about your current pass:
int foo(int A, int B) { |
Now passes llvm-test! There's not that much magical about handling constants. All it does it prevent This also means that it works fine on arguments. In fact, the included testcase Unfortunately, it doesn't seem to be much of a win. Zero replacements performed |
Looks reasonable, some more comments then :) First, you do want to handle arguments somewhat like constants. For example, if you have Arg == 1, This check: can be done once for the entire expression. In particular, in conjunction with your other check, you're Next, you should check phi nodes more aggressively. Consider this: if (A == 7) { here you can change the A in the PHI node with 7, because the use occurs in a dominating block, not Finally, you should check to make sure you're not replacing FP values incorrectly. Specifically, if you -Chris |
As for your more general system idea -- I actually came across this testcase I'm very interested in working on such a pass, though I doubt I could write it I don't really want to go the way of value ranges. That's not expressive enough. And this is there point where I admit that I'm just not entirely sure how that Perhaps I should just start with something simple like EQ and NE, and save the |
I'd be happy to help and to guide.
Optimization really isn't about catching every possibility for canonicalization and simplification, it's
Sounds great, -Chris |
Oh yeah, btw, instcombine knows that things like X*2&1 are always 0, it can track individual bits, inferring -Chris |
First attempt at record-based PropagateEquality pass. So far, it can handle this example: extern void f(); and not much else. |
first functional record-based propequal pass |
I've encountered some problems that warrant design changes. Firstly, it needs to become a two-pass system. The first pass tags the BBs, the The reason is that at the time we see the setcc instruction, the properties Secondly, the EQProperties and NEProperties aren't canonicalized. Suppose you The EQProperty, needs to store the whole equivalence list, not just pairs. Otherwise, it works rather well. It optimizes the original case that started |
I haven't looked in depth at your code (about to do that now) but I don't think that a 2-pass system is
Right. This is actually a good thing, because it will be required for some future extensions. For X = load P We can know that C == false. This case really does occur, for example when compiling the llvm -Chris |
Some feedback on the "first functional record-based propequal pass" pass: You might want to name the pass "CorrelationPropagation.cpp" or something, as eventually we'll be Please add a comment block at the top of the file explaining what the pass does, for example you can Please remove any #includes you can. You don't need Instruction.h (implied by Instructions.h), don't Instead of making Property an inheritance-based class, you could make property be a simple tuple (V1/ Please don't make the multimaps be global. They should be instance vars in the class. Why are you hashing the properties? With the above changes, PropertiesByHash could be implemented Right now your code appears to depend on the order of basic blocks in a function (you use visit runOnFunction() { bool WalkDominatorTree(DTNode *N, InputKnownProperties) { foreach (instruction i in N->getBasicBlock()) if (BBTerminator is branchinst &&
} else { This way, the recursion on the stack naturally handles "pushing and popping" new properties, the visit -Chris |
Oh yeah, PropagateEquality::newConstant is also already available in LLVM. You want: Which does constant folding for you. -Chris |
What if there is a BB with two predecessors, neither of whom dominate it, but It would mean doing a breadth-first traversal of the dominator tree, and each BB |
That boils down to being a question of compile time vs code quality trade off. In practice, I think these -Chris |
second attempt; includes load* ptr, ptr != null. walkDominatorTree got refactored into oblivion. You'll find parts of it named There's quite a bit of redundancy between the "addProperty" and "resolve" %A = seteq %x, 4 When the visitBranchInst calls addProperty, gives it Property(%C, true, ==). However, there are some cases this misses. "resolve" runs over every operand of %A = seteq %x, 4 Because %B is defined outside of the block with the property, it can't be As for performance, my tests run it with -mem2reg -instcombine -propequal, and |
I'm considering a different design. I realized that the == property is different Further, because checking for substitutes for variables is so common (once per As I'm still working on the implementation, it remains to be seen whether this |
collapse equivalent variables; use std::map for fast lookup The included feature tests aren't quite exhaustive, and some of them can be I'd appreciate any review, comments, testing and bug reports. :) |
fix KPcopy usage |
This optimization pass has since been enabled. |
Extended Description
The following code doesn't optimize down:
// Source: pag.csail.mit.edu/paste2005/slides/yang-paste2005-slides.ppt
extern void foo();
void process(int *p, int *q)
{
if (p != q)
return;
if (*p != *q)
foo(); // unreachable
}
In a branch whose condition depends on a EQ or NE, in the side where the
variables are equal (and if the types are equal (?)), replaceAllUsesOfWith(p,
q). This allows the compiler to optimize away the following test and should kill
the liveness of one of the variables. I suppose you could also do the same
optimization with a switch instruction.
Where would such an optimization belong? The instruction combiner?
The text was updated successfully, but these errors were encountered: