Created attachment 11756
test.ll

Description
-------------------------------------------------------------------
The loop reroller causes an assertion failure for this loop when compiling for 32-bit targets:

#define REAL float
void daxpy_ur(int n,REAL da,REAL *dx,REAL *dy,int m)
{
  for (int i = m; i < n; i = i + 4) {
    dy[i] = dy[i] + da*dx[i];
    dy[i+1] = dy[i+1] + da*dx[i+1];
    dy[i+2] = dy[i+2] + da*dx[i+2];
    dy[i+3] = dy[i+3] + da*dx[i+3];
  }
}

The assertion failure is:

opt: include/llvm/Support/Casting.h:239: typename llvm::cast_retty<X, Y*>::ret_type llvm::cast(Y*) [with X = llvm::PHINode, Y = llvm::Value]: Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
0  opt             0x00000000013f4802 llvm::sys::PrintStackTrace(_IO_FILE*) + 34
1  opt             0x00000000013f5cea
2  libpthread.so.0 0x00007f315825b8f0
3  libc.so.6       0x00007f31570ffa75 gsignal + 53
4  libc.so.6       0x00007f31571035c0 abort + 384
5  libc.so.6       0x00007f31570f8941 __assert_fail + 241
6  opt             0x0000000000f0fabd
7  opt             0x0000000001145657 llvm::LPPassManager::runOnFunction(llvm::Function&) + 1287
8  opt             0x0000000001384b28 llvm::FPPassManager::runOnFunction(llvm::Function&) + 568
9  opt             0x0000000001384c0b llvm::FPPassManager::runOnModule(llvm::Module&) + 43
10 opt             0x000000000138463c llvm::legacy::PassManagerImpl::run(llvm::Module&) + 892
11 opt             0x000000000058ee9e main + 6110
12 libc.so.6       0x00007f31570eac4d __libc_start_main + 253
13 opt             0x0000000000580909
Stack dump:
0. Program arguments: opt -loop-reroll -S -debug-only=loop-reroll
1. Running pass 'Function Pass Manager' on module '<stdin>'.
2. Running pass 'Loop Pass Manager' on function '@daxpy_ur'
3. Running pass 'Reroll loops' on basic block '%for.body'

Steps to reproduce
-------------------------------------------------------------------
1. Put the C program above in small.c
2. Compile
   $ clang -target arm-none-linux -mfloat-abi=soft small.c -S -o- -O1 -emit-llvm -o test.ll
   $ opt -loop-reroll -S -debug-only=loop-reroll < test.ll

I can also reproduce the assertion using -target i386-none-linux. I've attached test.ll to the bug.

Analysis
-------------------------------------------------------------------
The assertion comes from the LoopReroll::reroll() function, when it expands the code for the new induction variable. The new induction variable is cast to a phi node, but the value produced by the SCEV expander is not actually a phi node.

PHINode *NewIV = cast<PHINode>(Expander.expandCodeFor(H, IV->getType(), Header->begin()));

The SCEV expander produces an induction variable that looks like this:

%indvar = phi i32 [ %indvar.next, %for.body ], [ 0, %entry ]
%i.055 = phi i32 [ %add27, %for.body ], [ %m, %entry ]
%0 = add i32 %m, %indvar

The '%0' value is what the SCEV expander returns. It uses a phi node as an operand, but is itself an add instruction.

It looks like the cast to a phi node is only needed to get the backedge value to use for the end-of-loop test. I made a quick change to use a comparison of "NewIV == (IterCount-1)" instead of "NextIV == IterCount". It works for this example, but it causes some make-check failures in the loop-rerolling tests, and I'm not sure it is the correct fix.
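To make the two end-of-loop tests mentioned above concrete, here is a minimal C sketch that models them on a plain zero-based counter. It is not LoopReroll code: the function names are hypothetical, the counter stands in for %indvar only, and iter_count is assumed to be at least 1. Under those assumptions both tests leave the loop after the same number of trips; whether that carries over to the value the pass actually holds in NewIV is exactly the open question in the analysis.

```c
#include <assert.h>

/* Exit test on the incremented counter: "NextIV == IterCount". */
unsigned trips_with_next_iv_test(unsigned iter_count) {
    unsigned indvar = 0, trips = 0;
    assert(iter_count >= 1);      /* assumption: the loop runs at least once */
    for (;;) {
        ++trips;                  /* one execution of the loop body */
        unsigned next_iv = indvar + 1;
        if (next_iv == iter_count)
            break;
        indvar = next_iv;
    }
    return trips;
}

/* Exit test on the current counter: "NewIV == IterCount - 1". */
unsigned trips_with_new_iv_test(unsigned iter_count) {
    unsigned indvar = 0, trips = 0;
    assert(iter_count >= 1);      /* same assumption as above */
    for (;;) {
        ++trips;                  /* one execution of the loop body */
        if (indvar == iter_count - 1)
            break;
        indvar = indvar + 1;
    }
    return trips;
}
```

Calling both functions with the same iter_count should give the same trip count. The model deliberately ignores the "m +" offset seen in '%0'.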
I investigated this a bit more, and I think the non-constant lower bound is causing the SCEV-expander to not end the code generation in a phi-node, but in the add instead.

A simple loop like this also causes the assertion:

void foo(int *A, int *B, int m, int n) {
  for (int i = m; i < n; i+=4) {
    A[i+0] = B[i+0] * 4;
    A[i+1] = B[i+1] * 4;
    A[i+2] = B[i+2] * 4;
    A[i+3] = B[i+3] * 4;
  }
}

We don't hit the assertion on 64-bit targets because for some reason the loop is not re-rolled.

I posted a possible fix to llvm-commits.
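For readers unfamiliar with rerolling, the following C sketch shows roughly what the transformation is meant to achieve for foo, and why a non-constant lower bound leads to an add. It is a hand-written illustration, not output of the pass: foo_unrolled, foo_rerolled and foo_forms_agree are hypothetical names, and the trip-count formula is my own assumption about how the unrolled loop iterates.

```c
#include <assert.h>
#include <string.h>

/* The unrolled loop from the report, renamed for the comparison. */
void foo_unrolled(int *A, const int *B, int m, int n) {
    for (int i = m; i < n; i += 4) {
        A[i + 0] = B[i + 0] * 4;
        A[i + 1] = B[i + 1] * 4;
        A[i + 2] = B[i + 2] * 4;
        A[i + 3] = B[i + 3] * 4;
    }
}

/* Hand-rerolled form: a zero-based counter plus the lower bound m. */
void foo_rerolled(int *A, const int *B, int m, int n) {
    /* Number of iterations the unrolled loop executes. */
    int trips = (n > m) ? (n - m + 3) / 4 : 0;
    for (int indvar = 0; indvar < 4 * trips; ++indvar) {
        int i = m + indvar;   /* an add of m and the counter, not a phi */
        A[i] = B[i] * 4;
    }
}

/* Returns 1 when both forms write identical results for the given bounds. */
int foo_forms_agree(int m, int n) {
    enum { N = 64 };
    int B[N], A1[N], A2[N];
    assert(m >= 0 && n + 3 <= N);   /* the unrolled body may touch up to index n+2 */
    for (int k = 0; k < N; ++k) {
        B[k] = k + 1;
        A1[k] = A2[k] = -1;
    }
    foo_unrolled(A1, B, m, n);
    foo_rerolled(A2, B, m, n);
    return memcmp(A1, A2, sizeof A1) == 0;
}
```

With a constant lower bound of 0 the index would be the counter itself, so the expanded value could be a phi node directly. With a lower bound of m the index becomes m plus the counter, which matches the add described in the report.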
(In reply to comment #1)
> I investigated this a bit more, and I think the non-constant lower bound is
> causing the SCEV-expander to not end the code generation in a phi-node, but
> in the add instead.
>
> A simple loop like this also causes the assertion:
>
> void foo(int *A, int *B, int m, int n) {
>   for (int i = m; i < n; i+=4) {
>     A[i+0] = B[i+0] * 4;
>     A[i+1] = B[i+1] * 4;
>     A[i+2] = B[i+2] * 4;
>     A[i+3] = B[i+3] * 4;
>   }
> }
>
> We don't hit the assertion on 64-bit targets because for some reason the
> loop is not re-rolled.
>
> I posted a possible fix to llvm-commits.

Thanks! I'll look at it.
r198425