New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Miscompile lencod (after Sanjoy's SCEV changes?) #28318
Comments
Hi Tobias, It look like it broke earlier, in http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable/builds/3026/ and the only SCEV change in that build is 271151. I didn't notice this earlier because I thought this too was a failed build due to removing the Do you know why the test is failing? Is it a performance regression, or is the benchmark infinite looping? Finally, can you point me to some instructions on how I can reproduce this? |
Hi Sanjoy, you are right. This is really just the one commit. I tried to revert it locally, but I unfortunately already get conflicts. The command to try this locally should be: lnt runtest nt --sandbox /tmp --cc llvm_build/bin/clang-3.9 --test-suite test-suite/ --only-test MultiSource/Applications/JM/lencod --mllvm=-polly --mllvm=-polly-position=before-vectorizer I am currently traveling so I won't be able to debug this until next week. As the SCEV nsw stuff has many corner cases, it might very well be that this just triggers a Polly bug. So don't feel obliged to debug this trough. However, if you have some insights I would be glad to hear them. Best, |
*** Bug llvm/llvm-bugzilla-archive#27980 has been marked as a duplicate of this bug. *** |
@Johannes, this bug seems to be in code which you know best. Could you possibly have a look? OK. This is really a nasty bug, but I can reproduce it. I first tried to create a big .bc file and then just wanted to run bugpoint to reduce it. Unfortunately, this did not work for two reasons: 1) opt and llc/lli took minutes to just run this test case once. So everything took ages. 2) When specifying the individual passes one-by-one instead of using -O3, this bug disappears. :( So I had to revert to manually bisecting this bug. After a couple of hours debugging this is what I got:
File: block.c
Before Sanjoys patch we get:
After Sanjoys patch we get:
Without Load Hoisting we get:
How to reproduce the bug: Go to /home/grosser/Projects/polly/test-suite/MultiSource/Applications/JM/lencod and run: for i in |
Reduced test case I just managed to reduce this test case further. Interestingly, the new test case failed already before Sanjoy's patch. I see the following with latest Polly (274430): $opt test.ll -polly-scops -polly-process-unprofitable -polly-invariant-load-hoisting=false -analyze
... $opt test.ll -polly-scops -polly-process-unprofitable -analyze Even though Stmt_loop_next is always executed, it has an empty execution context when hoisted. This seems incorrect. It seems the reason is that the subscript expression of the hoisted memory access It seems we have a mismatch here. The invalid-context of the entire scop still allows the scop to be executed while we assume locally that some parts won't be executed. The following patch fixes this bug as well as the LNT bug we see in MultiSource/Applications/JM/lencod: --- a/lib/Analysis/ScopInfo.cpp
It restricts the invalid accesses to the memory accesses that are actually executed, which in this very case is enough to fix this issue as the only value of i0 that is every executed is i0 = 0, and the problematic values of i0 are larger. It would probably also make sense to collect the invalid context from the MemoryAccesses to ensure we won't get into a situation again where these invalid contexts are used, but not enforced by our run-time checks. Johannes, any thoughts on this? Best, |
I committed the fix proposed below in r275053. This addresses the most obvious issues and should make the buildbots green again. We may still need to review the load hoisting code more carefully. The fact that the run-time check we construct does not take into account the invalid-assumptions we derive for memory accesses looks rather fragile and would clearly benefit from some further review and testing. Unfortunately, Johannes is currently not available to help with these issues. |
mentioned in issue llvm/llvm-bugzilla-archive#27980 |
Extended Description
It seems that after Sanjoys SCEV changes that happend between 115336 and 115351 a miscompile was introduced in FAIL: lencod.execution_time
http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable/builds/3034
To the the build-breakage that happened before it is not easily possible to track down the precise version.
The text was updated successfully, but these errors were encountered: