Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

Closed
haoxintu opened this issue Feb 13, 2021 · 10 comments
Closed

[LoopUnroll] compile-time explosion with assumes at -O3 #48515

haoxintu opened this issue Feb 13, 2021 · 10 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@haoxintu
Copy link

Bugzilla Link 49171
Resolution FIXED
Resolved on Feb 16, 2021 03:07
Version trunk
OS Linux
CC @jdoerfert,@RKSimon,@nikic,@rotateright

Extended Description

Hi, all.

This code, s.c, is a valid program but makes Clang hung on with -O3 at compile time.

$cat s.c
int a, b, c, d;
void e() {
a = 7;
for (; a <= 78; a++) {
d = 3;
for (; d <= 73; d++) {
int f = 0;
b += c;
if (b) {
int g = 0;
for (f = 5; f; g)
;
}
}
}
}

$clang -c -w -O3 s.c
//endless compiling

This problem only occurs in clang-trunk with -O3, and other released versions or optimization flags seem well while dealing with this case.

$clang -v
clang version 13.0.0 (https://github.com/llvm/llvm-project 48bdd67)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/haoxin/haoxin-data/dut-research/compilers/llvm-project/build-20210127/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Candidate multilib: .;@m64
Selected multilib: .;@m64

Thanks.
Haoxin

@rotateright
Copy link
Contributor

The bug seems to be in either LoopUnroll or ValueTracking's isKnownNonEqual() and has something to do with @​llvm.assume() replication/analysis.

Also, this is not actually an infinite loop. If you adjust the for-loop constants down from 78/73, you can see the compile-time explosion in progress before it becomes intolerable:

time with "21, 16": user 0m3.467s
time with "22, 17": user 0m6.241s
time with "23, 18": user 0m16.959s

That's why the bug only becomes noticeable at -O3 rather than -O2. We use more aggressive unrolling parameters at O3.

cc'ing @​nikic and @​jdoerfert in case this might be related to https://reviews.llvm.org/D96208 ?

@jdoerfert
Copy link
Member

We generate a total of #​144 assumptions which should not be enough to cause a problem by itself.

From a quick look at this I see:
simplifyLoopAfterUnroll
will call
SimplifyICmpInst
which will call
computeKnownBits and computeKnownBitsFromAssume
a lot of times.

I imagine we look at a decent chunk of the 144 assumes for every if(b) branch in the unrolled loop, probably in a non-linear way?

@nikic
Copy link
Contributor

nikic commented Feb 15, 2021

IR test case:

@​a = dso_local global i32 0, align 4
@​c = dso_local global i32 0, align 4
@​b = dso_local global i32 0, align 4

define dso_local void @​e() local_unnamed_addr #​0 {
entry:
store i32 7, i32* @​a, align 4
%0 = load i32, i32* @​c, align 4
%1 = mul i32 %0, 71
%b.promoted.pre = load i32, i32* @​b, align 4
%add = sub i32 0, %0
%tobool.not.1 = icmp eq i32 %0, 0
br label %for.cond1.preheader

for.cond1.preheader: ; preds = %entry, %for.cond1.preheader
%b.promoted = phi i32 [ %b.promoted.pre, %entry ], [ %1, %for.cond1.preheader ]
%storemerge5 = phi i32 [ 7, %entry ], [ %inc10, %for.cond1.preheader ]
%tobool.not = icmp eq i32 %b.promoted, %add
call void @​llvm.assume(i1 %tobool.not)
call void @​llvm.assume(i1 %tobool.not.1)
store i32 %1, i32* @​b, align 4
%inc10 = add nuw nsw i32 %storemerge5, 1
store i32 %inc10, i32* @​a, align 4
%exitcond.not = icmp eq i32 %inc10, 79
br i1 %exitcond.not, label %for.end11, label %for.cond1.preheader

for.end11: ; preds = %for.cond1.preheader
ret void
}

declare void @​llvm.assume(i1)

Run with -loop-unroll -unroll-threshold=300.

@nikic
Copy link
Contributor

nikic commented Feb 15, 2021

A bit smaller:

define void @​test(i32 %arg) {
entry:
br label %loop

loop:
%iv = phi i32 [ 0, %entry ], [ %iv.inc, %loop ]
%iv2 = phi i32 [ 0, %entry ], [ %arg, %loop ]
%assume = icmp eq i32 %iv2, %arg
call void @​llvm.assume(i1 %assume)
%iv.inc = add nuw nsw i32 %iv, 1
%exitcond = icmp eq i32 %iv.inc, 20 ; Increase me!
br i1 %exitcond, label %exit, label %loop

exit:
ret void
}

declare void @​llvm.assume(i1)

I think that at least a large part of the problem is that we allow an unbounded scan to find non-dominating but guaranteed-to-execute assumes. We should limit that.

@rotateright
Copy link
Contributor

I think that at least a large part of the problem is that we allow an
unbounded scan to find non-dominating but guaranteed-to-execute assumes. We
should limit that.

Yes, that seems to work. This makes the example compile fast:

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 9e81f84b1c5f..912b57c5f2e0 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -551,8 +551,9 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
// The context comes first, but they're both in the same block.
// Make sure there is nothing in between that might interrupt
// the control flow, not even CxtI itself.

  • unsigned ScanLimit = 10;
    for (BasicBlock::const_iterator I(CxtI), IE(Inv); I != IE; ++I)
  •  if (!isGuaranteedToTransferExecutionToSuccessor(&*I))
    
  •  if (!isGuaranteedToTransferExecutionToSuccessor(&*I) || --ScanLimit == 0)
       return false;
    

    return !isEphemeralValueOf(Inv, CxtI);

@rotateright
Copy link
Contributor

Note that there's an instcombine test for that chunk of code added with:
https://reviews.llvm.org/D37215

I don't know if that test had any intent of checking the actual distance between assume and its context inst, but if we want to preserve it, we'd make the ScanLimit = 15 in my draft patch.

Let me know if I should post that on Phab and/or we want to do any compile-time-tracking experiments to find a better limit.

@jdoerfert
Copy link
Member

+1 for distance checks

If we are looking at this, we should add early exit various places or to look at nounwind and willreturn attributes of the surrounding function. If present, no need to look at the instructions.

@nikic
Copy link
Contributor

nikic commented Feb 15, 2021

@​spatel Your patch with a limit of 15 looks fine to me. I don't think we need to do further experiments, as this is a question of worst-case complexity rather than average-case performance.

@rotateright
Copy link
Contributor

@​spatel Your patch with a limit of 15 looks fine to me. I don't think we
need to do further experiments, as this is a question of worst-case
complexity rather than average-case performance.

Thanks! Committed the quick fix here - https://reviews.llvm.org/rG378941f611ab

I think that's enough to say this bug is fixed, but reopen if I got that wrong.

I haven't been following the details of guaranteed progress discussions, so not sure how to test this...does this correctly implement the suggested check for function attributes?

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 493eb08bb9b0..ef78847cb43e 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -548,6 +548,12 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
if (Inv == CxtI)
return false;

  • // Use function attributes to check if execution must reach the assume.
  • const Function *F = CxtI->getFunction();
  • if (F->hasFnAttribute(Attribute::WillReturn) ||
  •    F->hasFnAttribute(Attribute::NoUnwind))
    
  •  return !isEphemeralValueOf(Inv, CxtI);
    
  • // The context comes first, but they're both in the same block.
    // Make sure there is nothing in between that might interrupt
    // the control flow, not even CxtI itself.

@haoxintu
Copy link
Author

Thanks for the quick fixing!

Cheers,
Haoxin

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

4 participants