LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 33265 - Non-affine loops not code generated corretly
Summary: Non-affine loops not code generated corretly
Status: NEW
Alias: None
Product: Polly
Classification: Unclassified
Component: Optimizer (show other bugs)
Version: unspecified
Hardware: PC Linux
: P enhancement
Assignee: Polly Development Mailinglist
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-01 03:29 PDT by Tobias Grosser
Modified: 2021-09-12 14:38 PDT (History)
2 users (show)

See Also:
Fixed By Commit(s):


Attachments
Test case (1.64 KB, text/plain)
2017-06-01 03:29 PDT, Tobias Grosser
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Grosser 2017-06-01 03:29:05 PDT
Created attachment 18556 [details]
Test case

The attached test case breaks as follows:

po /tmp/non-affine-loop-not-working.ll -polly-allow-nonaffine-loops -polly-process-unprofitable -polly-codegen

Instruction does not dominate all uses!
  %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ]
  %p_indvar.next = add i64 %indvar, 1
Instruction does not dominate all uses!
  %indvar = phi i64 [ %indvar.next, %bb9 ], [ 0, %bb2 ]
  %0 = add i64 %x, %indvar
LLVM ERROR: Broken function found, compilation aborted!

It seems non-affine loops are broken in this case. Unfortunately it seems we have no code-generation test coverage from the point when they have been introduced. :(

As they are disabled by default that is not breaking anything in the default configuration, but we should still fix this and add test cases to not regress further.
Comment 1 Tobias Grosser 2017-06-04 02:08:49 PDT
This seems to be broken since:

commit c397ad2f2a4badc0f028e1e60d991f9ddc09bec1
Author: Tobias Grosser <tobias@grosser.es>
Date:   Thu Jan 19 14:12:45 2017 +0000

    BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts
    
    Before this change we created an additional reload in the copy of the incoming
    block of a PHI node to reload the incoming value, even though the necessary
    value has already been made available by the normally generated scalar loads.
    In this change, we drop the code that generates this redundant reload and
    instead just reuse the scalar value already available.
    
    Besides making the generated code slightly cleaner, this change also makes sure
    that scalar loads go through the normal logic, which means they can be remapped
    (e.g. to array slots) and corresponding code is generated to load from the
    remapped location. Without this change, the original scalar load at the
    beginning of the non-affine region would have been remapped, but the redundant
    scalar load would continue to load from the old PHI slot location.
    
    It might be possible to further simplify the code in addOperandToPHI,
    but this would not only mean to pull out getNewValue, but to also change the
    insertion point update logic. As this did not work when trying it the first
    time, this change is likely not trivial. To not introduce bugs last minute, we
    postpone further simplications to a subsequent commit.
    
    We also document the current behavior a little bit better.
    
    Reviewed By: Meinersbur
    
    Differential Revision: https://reviews.llvm.org/D28892
    
    git-svn-id: https://llvm.org/svn/llvm-project/polly/trunk@292486

This bug can be worked around for this one test case with the following patch:

-    // Get the reloaded value.
-    OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    MemoryAccess *Access = Stmt.getPHIAccessOrNULLFor(PHI);
+    if (Access) {
+      // Get the reloaded value.
+      OpCopy = getNewValue(Stmt, PHI, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    } else {
+      // Get some global variables.
+      Value *Op = PHI->getIncomingValueForBlock(IncomingBB);
+      OpCopy = getNewValue(Stmt, Op, BBCopyMap, LTS, getLoopForStmt(Stmt));
+    }

but this patch does not work in general. The issue is that we expect the incoming values to be available at the index PHI in the BBMap, which is generally ensured through a PHI-READ. However, if 'Op' is synthesizable, now PHI-READ-ACCESS is added to the model, instead we try to synthesize PHI, which does not work as we still have references to the old IV in the synthesizable expressions. Ideas to address this:

1) Make the scalar evolution expression not synthesizable, by checking if any of its loop is part of a non-affine region.