The attached file crashes in the correlated value propagation pass. LLVM trunk r261633.

$ ~/build/llvm/debug/bin/opt z08_bugpoint.ll -verify -correlated-propagation -analyze
Printing analysis 'Module Verifier' for function 'Manifest':
Pass::print not implemented for pass: 'Module Verifier'!
opt: /home/meinersbur/src/llvm/lib/Analysis/LazyValueInfo.cpp:766: bool {anonymous}::LazyValueInfoCache::solveBlockValueNonLocal({anonymous}::LVILatticeVal&, llvm::Value*, llvm::BasicBlock*): Assertion `isa<Argument>(Val) && "Unknown live-in to the entry block"' failed.
 0 opt             0x0000000001dd2f80 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 59
 1 opt             0x0000000001dd32d8
 2 opt             0x0000000001dd180f llvm::sys::RunSignalHandlers() + 153
 3 opt             0x0000000001dd299c
 4 libpthread.so.0 0x00007fcc16cf7d10
 5 libc.so.6       0x00007fcc15e97267 gsignal + 55
 6 libc.so.6       0x00007fcc15e98eca abort + 362
 7 libc.so.6       0x00007fcc15e9003d
 8 libc.so.6       0x00007fcc15e900f2
 9 opt             0x000000000127373f
10 opt             0x0000000001272d67
11 opt             0x000000000127274c
12 opt             0x0000000001275efc
13 opt             0x00000000012768f8 llvm::LazyValueInfo::getConstantOnEdge(llvm::Value*, llvm::BasicBlock*, llvm::BasicBlock*, llvm::Instruction*) + 188
14 opt             0x0000000001b81a68
15 opt             0x0000000001b827d4
16 opt             0x0000000001848e82 llvm::FPPassManager::runOnFunction(llvm::Function&) + 330
17 opt             0x000000000184901b llvm::FPPassManager::runOnModule(llvm::Module&) + 133
18 opt             0x0000000001849396
19 opt             0x0000000001849aab llvm::legacy::PassManagerImpl::run(llvm::Module&) + 271
20 opt             0x0000000001849cb7 llvm::legacy::PassManager::run(llvm::Module&) + 39
21 opt             0x0000000000f01314 main + 7148
22 libc.so.6       0x00007fcc15e82a40 __libc_start_main + 240
23 opt             0x0000000000edbd79 _start + 41
Stack dump:
0.	Program arguments: /home/meinersbur/build/llvm/debug/bin/opt z08_bugpoint.ll -verify -correlated-propagation -analyze
1.	Running pass 'Function Pass Manager' on module 'z08_bugpoint.ll'.
2.	Running pass 'Value Propagation' on function '@Manifest'
Aborted (core dumped)
No file has been attached.
This is probably fallout from my recent LVI changes. If you can get me a reproducer, I'll investigate.
Created attachment 15939 [details]
Bugpoint-reduced sample file derived from test-suite/MultiSource/Benchmarks/MiBench/consumer-typeset/z08.c

Sorry, here's the file. (Bugzilla sometimes seems to not upload the file that was selected.)
This really looks like invalid IR. Here's the key piece:

if.end1129:                                       ; preds = %polly.stmt.if.then1058, %if.then1127, %if.then1058
  %x.addr.0.ph.merge = phi %union.rec.15.147.241.373.462.517.674.764.930.984.1071.1229.1265.1300.1318.1336.1354.1425.1443.1549.1740.1791.1808* [ %x, %polly.stmt.if.then1058 ], [ %x, %if.then1058 ], [ null, %if.then1127 ]
  %x.addr.0.ph.merge16632 = phi %union.rec.15.147.241.373.462.517.674.764.930.984.1071.1229.1265.1300.1318.1336.1354.1425.1443.1549.1740.1791.1808* [ %x, %polly.stmt.if.then1058 ], [ %x.addr.0.ph.merge, %if.then1058 ], [ %x.addr.0.ph.merge, %if.then1127 ]
  unreachable

The second phi's inputs are the first phi *in the same block*. This is leading us to do queries on %x.addr.0.ph.merge outside of its defining block, but the root issue appears to be invalid IR.

Looking at the verifier, I noticed that we're whitelisting instructions in the same basic block for dominance queries. This is likely fine for all non-phi instructions, but appears to be incorrect for phis.

Michael, can you tweak the Verifier with the following diff and then try re-reducing? I suspect there is a real problem here, but the example bugpoint produced is invalid.

-    Assert(InstsInThisBlock.count(Op) || DT.dominates(Op, U),
+    Assert((!isa<PHINode>(I) && InstsInThisBlock.count(Op)) ||
+               DT.dominates(Op, U),
           "Instruction does not dominate all uses!", Op, &I);
Er, I don't see why a phi cannot have an input from another phi in the same BB.
David, if that is legal, I suspect we need to audit large parts of the optimizer which assume otherwise. To be clear, the basic block is not part of a cycle in the CFG.

The lang ref seems pretty clear about this:
"For the purposes of the SSA form, the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block (but after any definition of an ‘invoke‘ instruction’s return value on the same edge)."

The PHI input is not available on the edge specified. It's only defined in the given basic block, and there are no cycles in the CFG by which it can dominate the incoming edge.

What am I missing?
Yeah, I don't think

define void @f() {
entry:
  br label %next

next:
  %y = phi i32 [ 0, %entry ]
  %x = phi i32 [ %y, %entry ]
  ret void
}

should pass the verifier (which it does today).
(In reply to comment #6)
> David, if that is legal, I suspect we need to audit large parts of the
> optimizer which assume otherwise. To be clear, the basic block is not part
> of a cycle in the CFG.
>
> The lang ref seems pretty clear about this:
> "For the purposes of the SSA form, the use of each incoming value is deemed
> to occur on the edge from the corresponding predecessor block to the current
> block (but after any definition of an ‘invoke‘ instruction’s return value on
> the same edge)."
>
> The PHI input is not available on the edge specified. It's only defined in
> the given basic block and there's no cycles in the CFG by which it can
> dominate the incoming edge.
>
> What am I missing?

I've certainly forced myself to consider such cases when writing code which deals with a PHI's uses.

FWIW, Transforms/LoopVectorize/phi-hang.ll in the tests directory appears to be checking for this exact case.

I don't think it would be crazy to say that the langref might not be correct here. Perhaps I am wrong. It would be good to take this to the list.
Some thoughts on this:

The debate seems to be between a "strict interpretation" of PHIs that requires a source to dominate the corresponding incoming edge, and a "loose interpretation" that allows a source to be a preceding PHI in the same block.

Separate from, but related to, the question of what the verifier allows is the question of the semantics of PHIs. In particular, does the evaluation of a PHI source occur in the context of the predecessor edge/block, or in the context of the block holding the PHI? With the "strict interpretation", it seems clear-cut; evaluation happens in the context of the predecessor. With the "loose interpretation", it gets a little ambiguous (and I'd argue that the ambiguity itself is a worthwhile reason to prefer the "strict interpretation").

Consider this example:

declare i32 @f()
declare i1 @b(i32)

define void @foo(i1 %b1) {
entry:
  br i1 %b1, label %left, label %right

left:
  %l = call i32 @f()
  br label %join

right:
  %r = call i32 @f()
  br label %join

join:
  %j = phi i32 [ %l, %left ], [ %r, %right ]
  %k = phi i32 [ 1, %left ], [ %j, %right ]
  br label %loop

loop:
  %x = phi i32 [ 1, %join ], [ %z, %loop ]
  %y = phi i32 [ 1, %join ], [ %x, %loop ]
  %z = add i32 %x, 1
  %b2 = call i1 @b(i32 %y)
  br i1 %b2, label %loop, label %exit

exit:
  ret void
}

- In the case of the %r use on the %j phi, it must be evaluated in the context of the predecessor, because it doesn't dominate the block containing the PHI (or I suppose one could try to argue that it's ok to say it's evaluated in the context of the PHI block because you know you must be following a path where dynamically it was executed, but that seems rather convoluted).
- In the case of the %j use on the %k phi (which I'd argue should be illegal, but in the "loose interpretation" I'm discussing here it's legal), it must be evaluated in the context of the block containing the PHI, because in the predecessor it isn't defined yet.
- In the case of the %x use on the %y phi, things get more interesting. Consider the use of %y in the call to @b. In the first iteration, its value is 1. But then on the second iteration:
  + if we decide the %x use on the %y phi is evaluated in the context of the predecessor, the value of %y in the call to @b is 1 again (because that is the value %x had at the bottom of block %loop)
  + but if we decide the %x use on the %y phi is evaluated in the context of its block, the value of %y in the call to @b is now 2 (because that is the value %x has once its PHI executes on the second iteration)

So the "loose interpretation" would force us to pick which semantics we infer for cases like the third (use of a previous phi in the same block, but tied to an edge that is a back-edge), and it will disagree with either the first type (use of a value dominating the pred but not the PHI block) or the second type (use of a previous PHI in the same block on a non-back-edge). I'd rather not have to pick (by choosing the "strict interpretation", outlawing the second type, and having the first and third types consistent).

Now, I don't know if this next part is controversial, but I think the "evaluated in the context of the predecessor" semantics are the best choice for the same-block back-edge cases. For one, that seems to be how lowering currently interprets PHIs; if I compile and run this program:

```
declare void @printf(i8*, ...)
@format = internal constant [4 x i8] c"%d\0A\00"

define i1 @foo(i32 %val) {
  call void (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @format, i32 0, i32 0), i32 %val)
  %b = icmp eq i32 %val, 1
  ret i1 %b
}

define void @main() {
entry:
  br label %loop

loop:
  %x = phi i32 [ 1, %entry ], [ %z, %loop ]
  %y = phi i32 [ 1, %entry ], [ %x, %loop ]
  %z = add i32 %x, 1
  %b = call i1 @foo(i32 %y)
  br i1 %b, label %loop, label %exit

exit:
  ret void
}
```

It prints

1
1
2

which indicates that the back-edge cases are evaluated in the context of the predecessor.
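The printed 1 1 2 behavior can be modeled directly. Below is a small Python sketch (hypothetical names, not LLVM code) of the @main loop above under the "evaluated in the context of the predecessor" semantics: on the back edge, both phi sources are read before either phi result is written, so %y receives the value %x had at the bottom of the previous iteration.

```python
# Model of the @main/@foo loop above. On the back edge, both phis read
# the old values of their sources and are written simultaneously
# (Python's tuple assignment is exactly such a parallel copy).
def run_loop():
    printed = []
    x, y = 1, 1            # first iteration: both phis take their %entry values
    while True:
        z = x + 1          # %z = add i32 %x, 1
        printed.append(y)  # @foo prints %y ...
        if y != 1:         # ... and returns icmp eq %y, 1; exit when false
            return printed
        # Back edge: %x <- %z, %y <- old %x, evaluated as a parallel copy.
        x, y = z, x

print(run_loop())  # [1, 1, 2], matching the observed output
```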
For another thing, adopting the "evaluate in the context of the phi block" semantics would complicate reasonable transformations. Consider an example like this:

entry:
  br label %loop

loop:
  %x = phi i32 [ 1, %entry ], [ %z, %loop ]
  %y = phi i32 [ 1, %entry ], [ %w, %loop ]
  %z = add i32 %x, 1
  %w = add i32 %x, 0
  %b = call i1 @foo(i32 %y)
  br i1 %b, label %loop, label %exit

Since %w is defined as "add i32 %x, 0", you'd generally like to be able to replace all uses of %w with %x, but if we take the "evaluate in the context of the phi block" semantics for back-edge PHI sources, doing this replacement would change the value %y receives around the back-edge.

Third, it's my impression that "the literature" universally takes this view; two good references are

- Benoit Boissinot, Alain Darte, Fabrice Rastello, Benoît Dupont de Dinechin, and Christophe Guillon. Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency. In IEEE/ACM International Symposium on Code Generation and Optimization (CGO'09), pages 114–125. IEEE, 2009. (This very explicitly uses these semantics; the sources of the phis of a block are evaluated simultaneously in the predecessor, and the definitions are written simultaneously upon entry to the phi block.)
- Preston Briggs, Keith D. Cooper, Timothy J. Harvey, and L. Taylor Simpson. Practical Improvements to the Construction and Destruction of Static Single Assignment Form. Softw. Pract. Exper. 28, 8 (July 1998), 859–881. (As I understand it, this is the seminal paper on this corner of SSA, introducing the "lost copy problem" and the "swap problem", the discussions of which imply these semantics.)

If those are compelling reasons to prefer the "evaluate in the context of the predecessor" semantics (as I think they are), then using the "loose interpretation" and allowing PHI sources to be preceding PHIs in the same block means, unless I'm missing something, that if you see

label:
  %x = phi i32 [ 1, %a ], [ %z, %b ]
  %y = phi i32 [ 3, %a ], [ %x, %b ]

and we take the edge from %b to %label, then the semantics are that %y receives the prior value of %x if that was a back-edge but receives the value of %z otherwise. That seems convoluted to me.
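The "lost copy" and "swap" problems from the Briggs et al. paper hinge on exactly this simultaneity. Here is a minimal Python illustration (my own sketch, not taken from the paper) of why back-edge phi sources must all be read before any phi result is written:

```python
# Two phis that exchange values around a back edge, e.g.
#   %x = phi i32 [ ..., %a ], [ %y, %loop ]
#   %y = phi i32 [ ..., %a ], [ %x, %loop ]
# Correct lowering is a parallel copy: read both old values, then write both.
def parallel_copy(x, y):
    return y, x

# Naive sequential lowering loses a value: the second copy already sees
# the updated x. This is the classic "swap problem".
def sequential_copy(x, y):
    x = y
    y = x
    return x, y

print(parallel_copy(1, 2))    # (2, 1): values swapped correctly
print(sequential_copy(1, 2))  # (2, 2): the old value of x is lost
```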
I regenerated the file in question and can confirm that it contains a phi using another phi in the same BB. The verifier with the suggested change rejects it. I therefore think it does not make sense to bugpoint-reduce it again. Please notify me if you still think it is useful.

It is the output of my own modification, hence not a direct bug in LLVM. However, I'd expect the verifier to report it if it is not well-formed.

Thanks to Joseph Tremoulet for his elaboration. I also think it makes more sense to reject such IR instead of making all passes handle it. It doesn't add any expressiveness, because

next:
  %y = phi i32 [ 0, %entry ]
  %x = phi i32 [ %y, %entry ]

has the same meaning as, and can easily be converted to:

next:
  %y = phi i32 [ 0, %entry ]
  %x = phi i32 [ 0, %entry ]

(assuming that 'next' is not in a loop)

Michael
Of course a phi can use another phi in the same block when that block is a self-loop, since the use then occurs on the back edge. We cannot also allow phi def-use chains within a block; that is an obvious contradiction. Since those semantics would not be well defined, I can't see how a pass could handle this case and be correct in general. A block's phis are unordered, and passes need to be able to assume that.

The verifier should fail on this IR. It's shocking that it doesn't.
"Fixed", as the verifier now rejects such IR since http://reviews.llvm.org/D18443 (r264528).

During the review, Sanjoy mentioned that InstsInThisBlock is just a performance optimization to quickly accept instructions that were already seen in the BasicBlock. This means it should behave the same as if DT.dominates(Op, U) alone decided whether the dominance requirement holds, which rejects such IR.
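The effect of the tightened check can be sketched abstractly. The following Python model is hypothetical (LLVM's actual Verifier is C++ and consults DominatorTree); it shows why, once the same-block shortcut no longer applies to phi operands, a source defined by another phi in the same block is rejected, while a phi using a same-block value across a self-loop back edge remains legal:

```python
# A phi operand's use is deemed to occur on the incoming edge, i.e. at
# the end of the incoming block, so the operand's defining block must
# dominate that incoming block.
def phi_operand_ok(def_block, incoming_block, dominates):
    return dominates(def_block, incoming_block)

# Toy CFG for the @f example above: entry -> next, no back edge.
DOM = {("entry", "entry"), ("entry", "next"), ("next", "next")}
def dominates(a, b):
    return (a, b) in DOM

# %x = phi i32 [ %y, %entry ] with %y defined in %next:
# %next does not dominate %entry, so this is rejected.
print(phi_operand_ok("next", "entry", dominates))   # False
# An operand defined in %entry itself is fine.
print(phi_operand_ok("entry", "entry", dominates))  # True
# A self-loop predecessor is dominated by its own block, which is why a
# phi may still use a same-block phi across such a back edge.
print(phi_operand_ok("next", "next", dominates))    # True
```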