Created attachment 18556 [details] Test case The attached test case breaks as follows: po /tmp/non-affine-loop-not-working.ll -polly-allow-nonaffine-loops -polly-process-unprofitable -polly-codegen Instruction does not dominate all uses! %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ] %p_indvar.next = add i64 %indvar, 1 Instruction does not dominate all uses! %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ] %0 = add i64 %x, %indvar LLVM ERROR: Broken function found, compilation aborted! It seems non-affine loops are broken in this case. Unfortunately it seems we have no code-generation test coverage from the point when they have been introduced. :( As they are disabled by default that is not breaking anything in the default configuration, but we should still fix this and add test cases to not regress further.
This seems to be broken since: commit c397ad2f2a4badc0f028e1e60d991f9ddc09bec1 Author: Tobias Grosser <tobias@grosser.es> Date: Thu Jan 19 14:12:45 2017 +0000 BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts Before this change we created an additional reload in the copy of the incoming block of a PHI node to reload the incoming value, even though the necessary value has already been made available by the normally generated scalar loads. In this change, we drop the code that generates this redundant reload and instead just reuse the scalar value already available. Besides making the generated code slightly cleaner, this change also makes sure that scalar loads go through the normal logic, which means they can be remapped (e.g. to array slots) and corresponding code is generated to load from the remapped location. Without this change, the original scalar load at the beginning of the non-affine region would have been remapped, but the redundant scalar load would continue to load from the old PHI slot location. It might be possible to further simplify the code in addOperandToPHI, but this would not only mean to pull out getNewValue, but to also change the insertion point update logic. As this did not work when trying it the first time, this change is likely not trivial. To not introduce bugs last minute, we postpone further simplications to a subsequent commit. We also document the current behavior a little bit better. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D28892 git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@292486 This bug can be worked around for this one test case with the following patch: - // Get the reloaded value. - OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt)); + MemoryAccess *Access = Stmt.getPHIAccessOrNULLFor(PHI); + if (Access) { + // Get the reloaded value. + OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt)); + } else { + // Get some global variables. + Value *Op = PHI->getIncomingValueForBlock(IncomingBB); + OpCopy = getNewValue(Stmt, Op, BBCopyMap, LTS, getLoopForStmt(Stmt)); + } but this patch does not work in general. The issue is that we expect the incoming values to be available at the index PHI in the BBMap, which is generally ensured through a PHI-READ. However, if 'Op' is synthesizable, now PHI-READ-ACCESS is added to the model, instead we try to synthesize PHI, which does not work as we still have references to the old IV in the synthesizable expressions. Ideas to address this: 1) Make the scalar evolution expression not synthesizable, by checking if any of its loop is part of a non-affine region.