[LoopUnroll] compile-time explosion with assumes at -O3 #48515

haoxintu · 2021-02-13T14:52:57Z


Bugzilla Link	49171
Resolution	FIXED
Resolved on	Feb 16, 2021 03:07
Version	trunk
OS	Linux
CC	@jdoerfert,@RKSimon,@nikic,@rotateright

Extended Description

Hi, all.

This code, s.c, is a valid program but makes Clang hung on with -O3 at compile time.

$cat s.c
int a, b, c, d;
void e() {
a = 7;
for (; a <= 78; a++) {
d = 3;
for (; d <= 73; d++) {
int f = 0;
b += c;
if (b) {
int g = 0;
for (f = 5; f; g)
;
}
}
}
}

$clang -c -w -O3 s.c
//endless compiling

This problem only occurs in clang-trunk with -O3, and other released versions or optimization flags seem well while dealing with this case.

$clang -v
clang version 13.0.0 (https://github.com/llvm/llvm-project 48bdd67)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/haoxin/haoxin-data/dut-research/compilers/llvm-project/build-20210127/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Candidate multilib: .;@m64
Selected multilib: .;@m64

Thanks.
Haoxin

The text was updated successfully, but these errors were encountered:

rotateright · 2021-02-15T17:06:35Z

The bug seems to be in either LoopUnroll or ValueTracking's isKnownNonEqual() and has something to do with @llvm.assume() replication/analysis.

Also, this is not actually an infinite loop. If you adjust the for-loop constants down from 78/73, you can see the compile-time explosion in progress before it becomes intolerable:

time with "21, 16": user 0m3.467s
time with "22, 17": user 0m6.241s
time with "23, 18": user 0m16.959s

That's why the bug only becomes noticeable at -O3 rather than -O2. We use more aggressive unrolling parameters at O3.

cc'ing @nikic and @jdoerfert in case this might be related to https://reviews.llvm.org/D96208 ?

jdoerfert · 2021-02-15T17:33:53Z

We generate a total of #144 assumptions which should not be enough to cause a problem by itself.

From a quick look at this I see:
simplifyLoopAfterUnroll
will call
SimplifyICmpInst
which will call
computeKnownBits and computeKnownBitsFromAssume
a lot of times.

I imagine we look at a decent chunk of the 144 assumes for every if(b) branch in the unrolled loop, probably in a non-linear way?

nikic · 2021-02-15T17:40:03Z

IR test case:

@a = dso_local global i32 0, align 4
@c = dso_local global i32 0, align 4
@b = dso_local global i32 0, align 4

define dso_local void @e() local_unnamed_addr #0 {
entry:
store i32 7, i32* @a, align 4
%0 = load i32, i32* @c, align 4
%1 = mul i32 %0, 71
%b.promoted.pre = load i32, i32* @b, align 4
%add = sub i32 0, %0
%tobool.not.1 = icmp eq i32 %0, 0
br label %for.cond1.preheader

for.cond1.preheader: ; preds = %entry, %for.cond1.preheader
%b.promoted = phi i32 [ %b.promoted.pre, %entry ], [ %1, %for.cond1.preheader ]
%storemerge5 = phi i32 [ 7, %entry ], [ %inc10, %for.cond1.preheader ]
%tobool.not = icmp eq i32 %b.promoted, %add
call void @llvm.assume(i1 %tobool.not)
call void @llvm.assume(i1 %tobool.not.1)
store i32 %1, i32* @b, align 4
%inc10 = add nuw nsw i32 %storemerge5, 1
store i32 %inc10, i32* @a, align 4
%exitcond.not = icmp eq i32 %inc10, 79
br i1 %exitcond.not, label %for.end11, label %for.cond1.preheader

for.end11: ; preds = %for.cond1.preheader
ret void
}

declare void @llvm.assume(i1)

Run with -loop-unroll -unroll-threshold=300.

nikic · 2021-02-15T18:02:26Z

A bit smaller:

define void @test(i32 %arg) {
entry:
br label %loop

loop:
%iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
%iv2 = phi i32 [ 0, %entry ], [ %arg, %loop ]
%assume = icmp eq i32 %iv2, %arg
call void @llvm.assume(i1 %assume)
%iv.inc = add nuw nsw i32 %iv, 1
%exitcond = icmp eq i32 %iv.inc, 20 ; Increase me!
br i1 %exitcond, label %exit, label %loop

exit:
ret void
}

declare void @llvm.assume(i1)

I think that at least a large part of the problem is that we allow an unbounded scan to find non-dominating but guaranteed-to-execute assumes. We should limit that.

rotateright · 2021-02-15T18:21:10Z

I think that at least a large part of the problem is that we allow an
unbounded scan to find non-dominating but guaranteed-to-execute assumes. We
should limit that.

Yes, that seems to work. This makes the example compile fast:

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 9e81f84b1c5f..912b57c5f2e0 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -551,8 +551,9 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
// The context comes first, but they're both in the same block.
// Make sure there is nothing in between that might interrupt
// the control flow, not even CxtI itself.

unsigned ScanLimit = 10;
for (BasicBlock::const_iterator I(CxtI), IE(Inv); I != IE; ++I)

 if (!isGuaranteedToTransferExecutionToSuccessor(&*I))

 if (!isGuaranteedToTransferExecutionToSuccessor(&*I) || --ScanLimit == 0)
   return false;

return !isEphemeralValueOf(Inv, CxtI);

rotateright · 2021-02-15T18:43:04Z

Note that there's an instcombine test for that chunk of code added with:
https://reviews.llvm.org/D37215

I don't know if that test had any intent of checking the actual distance between assume and its context inst, but if we want to preserve it, we'd make the ScanLimit = 15 in my draft patch.

Let me know if I should post that on Phab and/or we want to do any compile-time-tracking experiments to find a better limit.

jdoerfert · 2021-02-15T19:14:11Z

+1 for distance checks

If we are looking at this, we should add early exit various places or to look at nounwind and willreturn attributes of the surrounding function. If present, no need to look at the instructions.

nikic · 2021-02-15T19:20:38Z

@spatel Your patch with a limit of 15 looks fine to me. I don't think we need to do further experiments, as this is a question of worst-case complexity rather than average-case performance.

rotateright · 2021-02-15T20:50:51Z

@spatel Your patch with a limit of 15 looks fine to me. I don't think we
need to do further experiments, as this is a question of worst-case
complexity rather than average-case performance.

Thanks! Committed the quick fix here - https://reviews.llvm.org/rG378941f611ab

I think that's enough to say this bug is fixed, but reopen if I got that wrong.

I haven't been following the details of guaranteed progress discussions, so not sure how to test this...does this correctly implement the suggested check for function attributes?

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 493eb08bb9b0..ef78847cb43e 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -548,6 +548,12 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
if (Inv == CxtI)
return false;

// Use function attributes to check if execution must reach the assume.
const Function *F = CxtI->getFunction();
if (F->hasFnAttribute(Attribute::WillReturn) ||

   F->hasFnAttribute(Attribute::NoUnwind))

 return !isEphemeralValueOf(Inv, CxtI);

// The context comes first, but they're both in the same block.
// Make sure there is nothing in between that might interrupt
// the control flow, not even CxtI itself.

haoxintu · 2021-02-16T11:07:06Z

Thanks for the quick fixing!

Cheers,
Haoxin

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

haoxintu commented Feb 13, 2021

rotateright commented Feb 15, 2021

jdoerfert commented Feb 15, 2021

nikic commented Feb 15, 2021

nikic commented Feb 15, 2021

rotateright commented Feb 15, 2021

rotateright commented Feb 15, 2021

jdoerfert commented Feb 15, 2021

nikic commented Feb 15, 2021

rotateright commented Feb 15, 2021

haoxintu commented Feb 16, 2021

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

Comments

haoxintu commented Feb 13, 2021

Extended Description

rotateright commented Feb 15, 2021

jdoerfert commented Feb 15, 2021

nikic commented Feb 15, 2021

nikic commented Feb 15, 2021

rotateright commented Feb 15, 2021

rotateright commented Feb 15, 2021

jdoerfert commented Feb 15, 2021

nikic commented Feb 15, 2021

rotateright commented Feb 15, 2021

haoxintu commented Feb 16, 2021