New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Polly] llvm.expect cannot be code generated #31671
Comments
Test case demonstrating problem with icmp statement The problem seems not to lie with code generation for llvm.expect but with the "icmp" statement that the llvm.expect has a dependency on. This is actually confirmed by a review of the code as, SCEV cannot model the "icmp" statement. In file lib/Support/SCEVAffinator.cpp, we see SCEV modelling for all IR statements including zext, add, div, etc. There is a design decision to be taken here. The choice is:
|
Even simpler test case %tmp3 = icmp ult i64 4, 12 Variant A breaks, variant B does not. The issue seems to be during expression generation. In Variant A, @llvm.expect and the zext expression are generated at the top of the region, instead of the basic block where they have been before. That looks wrong and likely needs to be corrected. Variant B get's it right. |
@tobias sir. In the second test case, AFAIK, what is happening is that since %tmp5 is not being used anywhere, Polly discards CodeGeneration for the statement at line 20 (The statement containing the llvm.expect call). This can be confirmed with the fact that the function "trySynthesizeNewValue()" in lib / CodeGen / BlockGenerators.cpp is not invoked in the execution for this program, whereas it is called in the original test case and also in the one that I submitted earlier. I am trying to understand the trySynthesizeNewValue() function. Also, where and why Polly performs copyBB and splitBB. |
trySynthesizeNewValue() is called with the @llvm.expect instruction as parameter Old. It determines the correct insert point using Builder.GetInsertPoint() - which is the br statement at the end of polly.stmt.bb2 - this is the desired value. After this, expandCodeFor is called to get the expanded statement. In this function, although the correct insert point is passed, the insertion is hoisted to a higher loop nest. This is performed because the instruction is loop invariant. This, infact, is part of llvm's SCEV expander. Code can be found in the file - llvm / lib / Analysis / ScalarEvolutionExpander.cpp - function SCEVExpander::expand(). The weird part was, when I removed llvm.expect from the list of ignored intrinsics, in the file - polly / lib / Support / ScopHelper.cpp - function isIgnoredInstrinsic() , it passed all regression tests, and eliminated this error as well. It copied the basic block in the right place as shown (I debug printed copyBB) - ############## copyBB is complete ################ polly.stmt.bb2: ; preds = %polly.stmt.bb1 ############## copyBB is complete ################ I am not sure if changing the SCEVExpander is the right thing to do. Please let me know your thoughts. Thanks :) |
Hi Annanay, I am impressed. Without asking a lot of further questions, you are clearly capable of digging your way through the code, understanding the issues, and proposing sensible solutions! Very nice, indeed! Now to the technical discussion: I am surprised, that having llvm.expect in the ignored intrinsics fixes this issue. If these indeed fixes the problem, we should take it. However, beforehand we should understand why this works. I would assume llvm.expect does not have any side effects and would consequently be ignored anyhow. Hence, it likely does not need to be in the ignoredIntrinsic's list for the scop to be detected. If all tests pass and your test case still generates Polly code during code generation, this is true. Can you confirm this? Having an intrinsic in isIgnoredIntrinics means for code generation, that we decide to not generate code for llvm.expect(). Hence, by removing llvm.expect from isIgnoredIntrinsic, we ask the code generator to not ignore it when generating the basic block. This seems to be OK, as the llvm.expect() should still hold after Polly rescheduled a piece of code. I am not 100% clear why this fixes the code generation issue though. The direct impact is that the return value of llvm.expect will appear in the BBList, which is then likely used instead of inserting a new copy higher up in the program. Can you investigate what exactly happens here. The last question I am interested in. Assuming the bb in our test case is split in two bbs, such that the llvm.expect is in the first bb and its use is in the second. Does this also crash today? Is this test case also fixed by removing llvm.expect from the set of ignored intrinsics? Best, Best, |
Thank you sir :) Previously, when Polly was supposed to code-generate the llvm.expect instruction, it merely skipped it because it was part of the ignoredIntrinsic list. Then, when it encountered the trunc instruction, it was forced to code-generate llvm.expect using trySynthesizeValue(). So, for your first question, I do not think it harms to code generate llvm.expect in the first place. Second question: The error happens because, when the llvm.expect function is code generated through trySynthesizeValue() -> expandCodeFor() -> llvm::SCEVExpander::expand() it inserts the statement in a BB that is hoisted as far out as possible. The InsertPoint that is passed is not honored for some reason. I am exploring the last question about the BB split, I am trying to come up with such a test case. Will post any updates here. Thanks, |
What also is interesting. If llvm.expect can be hoised, why can the operands it depends on not also be hoisted? |
I think I may have found something interesting regarding the same. So I checked the modification of the insert point in llvm::SCEVExpander::expand(), and found that it doesn't change the insert point erroneously. However - File - polly / lib / Support / ScopHelper.cpp I traced the code generation to the expandCodeFor() in this struct. Polly's expandCodeFor is called using insert point: The basic block of insert point -
|
Hi Annanay, thanks for the thorough investigation and also the fruitful discussion. If you would like to get more information about the code snippet you asked above, it might be useful to ask Johannes, who committed this change, but unfortunately did not add comments: commit a4c1f49147b844584ada9b25bb7bd85d65819cb6
7308-0d34-0410-b5e6-96231b3b80d8 Best, |
Hi Annanay, any update here? |
Hi sir, I assumed that removing the BBMap insert statement in BlockGenerators.cpp::trySynthesizeNewValue() was the fix to this bug. But I will speak to Johannes about the Insertion point detection algorithm before coming to a consensus on this. Thanks, |
Sure. Now, we need to get this bug fixed. ;) The standard process would be to submit a patch to reviews.llvm.org, motivate the bug fix in the commit message, and possibly add some additional evidence, e.g., by running the LLVM test-suite and showing that it does not trigger any bugs. |
That's also a great moment to include Johannes, as you can reference his old commit in your patch summary and discuss its relation to this issue and how your solution addresses the problem you have seen |
Haha, right sir, I will communicate with Johannes to figure out why this fix works, in some depth. And then I will go ahead and submit the patch to Phabricator. Thanks, |
I would prefer if you could submit the patch to phabricator and then discuss on phabricator. It is best practice to have all development discussions in practice such that others can benefit from it. |
I mean in PUBLIC |
Sure sir. Will do so. |
Resolved in r303006. |
Extended Description
opt -polly-codegen /tmp/expect-causes-issues.ll -polly-process-unprofitable
WARNING: You're attempting to print out a bitcode file.
This is inadvisable as it may cause display problems. If
you REALLY want to taste LLVM bitcode first-hand, you
can force output with the `-f' option.
Instruction does not dominate all uses!
%p_tmp3 = icmp ult i64 4, 12
%0 = zext i1 %p_tmp3 to i64
The text was updated successfully, but these errors were encountered: