Get the example from PR1423 (the optimized bc file), run this command: llc attachment.cgi -o t.s -time-passes -f -fast You get an infinite recursion: #6132 0x0020e575 in llvm::RegScavenger::scavengeRegister () #6133 0x000356f5 in llvm::ARMRegisterInfo::eliminateFrameIndex () #6134 0x0020e575 in llvm::RegScavenger::scavengeRegister () #6135 0x000356f5 in llvm::ARMRegisterInfo::eliminateFrameIndex () #6136 0x0020e575 in llvm::RegScavenger::scavengeRegister () #6137 0x000356f5 in llvm::ARMRegisterInfo::eliminateFrameIndex () This seems to depend on -fast. -Chris
Does -fast disable coalescing? There might be an issue there if "dead" operands are not propagated.
yeah, could be.
Created attachment 968 [details] bugpoint-reduced-simplified.bc This bug can be reproduced without "-fast" using this testcase.
This bug occurs when the store in ScavengingFrameIndex has an offset greater than 2^12 (max ldr/str offset). On this case, the eliminateFrameIndex needs RegScavenger to emit the store generated by RegScavenger (infinite loop).
Created attachment 982 [details] RegScavenger.patch This patch makes easier to find this bug. The RegScavenger mustn't be used to eliminate the ScavengingFrameIndex. So, we shouldn't pass it to the eliminateFrameIndex function.
To solve, I think the ScavengingFrameIndex offset never should be greater than 2^12. For this, we should resort the stack: When using sp: --------------------- | callee saved reg | --------------------- | static alloca | --------------------- | spills | --------------------- when using fp: --------------------- | callee saved reg | --------------------- | spills | --------------------- | static alloca | --------------------- | dynamic alloca | --------------------- This solves the bug for the most cases. The bug would persist only for very rare cases (functions with more than ~1024 spills).
Fixed. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070528/050110.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070528/050111.html The proposed idea was not implemented because the PrologEpilogInserter already has code to define the ScavengingFrameIndex offset. This bug was caused by a bug in this code.
This is the wrong fix. It's sweeping the bug under the rug. Even if disabling the use of RS when FP is used is correct (when I am not convinced of), that's up to the target to not instantiate RS when it detects that. PEI shouldn't do this check. Please revert the patch. I've not have a chance to look into this. Please explain, in this case, why is ScavengingFrameIndex not within the displacement limit? If there is a bug in the ARMRegisterInfo.cpp code that causes the miscalculation, then please fix it. If we ever run into situation where it's not possible to reserve a legal frame index, then we need a proper solution. Can we detect that before register allocation? If so, we need to detect that and reserve R12 and make sure RS isn't instantiated.
I'm not disabling the use of RS when FP is used. About 30 lines before my change, there is this code: if (RS && RegInfo->hasFP(Fn)) { int SFI = RS->getScavengingFrameIndex(); if (SFI >= 0) { ... The bug was: the PrologEpilogInserter has two chunks of code to define the ScavengingFrameIndex offset (one for when fp is used and other for when sp is used). The sp chunk of code was overwriting the calculation made by fp chunk of code.
Now I understand it! :-)