Excessive memory and CPU use in tail duplication and associated passes due to critical edge splitting #33015

Open
jsonn opened this issue Jul 1, 2017 · 4 comments
Labels
bugzilla Issues migrated from bugzilla llvm:codegen slow-compile

Comments

@jsonn
Contributor

jsonn commented Jul 1, 2017

Bugzilla Link: 33668
Version: trunk
OS: Linux
Blocks: #33840
Attachments: Compile for AMD64 with -O2
CC: @zmodem, @hfinkel, @tstellar

Extended Description

The attached (unreduced) test case needs 1.6 GB peak RAM and about 3 min to compile on my laptop after r296416; before that revision it took 100 MB and 1 s.

@llvmbot
Collaborator

llvmbot commented Jul 6, 2017

I think critical edge splitting is doing the right thing here, and splitting the critical edges enables (early) tail duplication of the computed goto.

Now, in theory, that also seems like the right thing to do, since this allows the computed gotos to be better predicted, which is pretty much the point of the whole exercise. But it looks like actually performing the tail duplication is slow, and regalloc can't really handle the result well:

   ---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
  73.0840 ( 43.7%)   0.1280 ( 12.3%)  73.2120 ( 43.5%)  73.2136 ( 43.5%)  Simple Register Coalescing
  37.5720 ( 22.5%)   0.6480 ( 62.3%)  38.2200 ( 22.7%)  38.2208 ( 22.7%)  Tail Duplication
  32.3400 ( 19.4%)   0.1080 ( 10.4%)  32.4480 ( 19.3%)  32.4459 ( 19.3%)  Eliminate PHI nodes for register allocation
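
To make the cost of duplicating a computed-goto dispatch block concrete, here is a toy Python model. It is only a sketch: the dictionary-based CFG and the names `build_dispatch_cfg`, `tail_duplicate`, and `edge_count` are hypothetical, not LLVM's data structures or its tail duplication pass, and the model makes no attempt to reproduce the timings above. It assumes every handler block ends by jumping to one shared dispatch block, as in a typical computed-goto interpreter loop.

```python
def build_dispatch_cfg(n):
    """Toy CFG (assumption): n handler blocks all jump to one shared
    'dispatch' block, which ends in an indirect branch to every handler."""
    handlers = [f"h{i}" for i in range(n)]
    cfg = {h: ["dispatch"] for h in handlers}
    cfg["dispatch"] = list(handlers)
    return cfg


def tail_duplicate(cfg, block):
    """Copy `block` into each predecessor whose only successor is `block`.

    Each such predecessor inherits the successor list of `block`. The
    original block is dropped once nothing branches to it any more.
    """
    succs = cfg[block]
    out = {}
    for name, targets in cfg.items():
        if name == block:
            continue
        out[name] = list(succs) if targets == [block] else list(targets)
    # Keep the original block only if some edge still points at it.
    if any(block in targets for targets in out.values()):
        out[block] = list(succs)
    return out


def edge_count(cfg):
    """Total number of CFG edges."""
    return sum(len(targets) for targets in cfg.values())


cfg = build_dispatch_cfg(100)
print(edge_count(cfg), edge_count(tail_duplicate(cfg, "dispatch")))  # 200 10000
```

In this model the edge count grows from 2n to n², and every handler ends up with n predecessors. That is the kind of growth that later passes working on PHI nodes and live ranges may then have to process, although the model does not establish that this is what happens in the attached test case.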

Passing -disable-early-taildup makes this go away, so regalloc is fine with the edges just being split, without duplication:

   ---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
   3.5280 ( 66.2%)   0.0000 (  0.0%)   3.5280 ( 65.3%)   3.5313 ( 65.4%)  Branch Probability Basic Block Placement
   0.4120 (  7.7%)   0.0040 (  5.3%)   0.4160 (  7.7%)   0.4154 (  7.7%)  Simple Register Coalescing
   0.3080 (  5.8%)   0.0120 ( 15.8%)   0.3200 (  5.9%)   0.3196 (  5.9%)  Machine Block Frequency Analysis
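
For comparison, splitting critical edges on its own only adds one block per critical edge. A critical edge is one whose source block has more than one successor and whose destination block has more than one predecessor. The following Python sketch shows the idea on a toy CFG; the dictionary representation and the names `predecessors`, `critical_edges`, and `split_critical_edges` are hypothetical and do not mirror LLVM's implementation.

```python
def predecessors(cfg):
    """Map each block to the list of blocks that branch to it."""
    preds = {block: [] for block in cfg}
    for block, targets in cfg.items():
        for target in targets:
            preds[target].append(block)
    return preds


def critical_edges(cfg):
    """Edges whose source has >1 successor and whose target has >1 predecessor."""
    preds = predecessors(cfg)
    return [
        (block, target)
        for block, targets in cfg.items()
        for target in targets
        if len(targets) > 1 and len(preds[target]) > 1
    ]


def split_critical_edges(cfg):
    """Insert one new empty block on every critical edge."""
    out = {block: list(targets) for block, targets in cfg.items()}
    for i, (src, dst) in enumerate(critical_edges(cfg)):
        mid = f"split{i}"  # hypothetical name for the inserted block
        out[src][out[src].index(dst)] = mid  # src now branches to the new block
        out[mid] = [dst]                     # the new block falls through to dst
    return out


# 'a' has two successors and 'c' has two predecessors, so a->c is critical.
cfg = {"a": ["b", "c"], "b": ["c"], "c": []}
print(critical_edges(cfg))        # [('a', 'c')]
print(split_critical_edges(cfg))  # {'a': ['b', 'split0'], 'b': ['c'], 'c': [], 'split0': ['c']}
```

Here the CFG grows by exactly one block and one edge per critical edge, so the growth is linear in the number of critical edges, unlike the duplication case.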

I'm not really sure what we want to do here, but I'd say the thing that should be bailing early here is taildup, not edge splitting. Adding Kyle as the resident taildup expert.

@zmodem
Collaborator

zmodem commented Aug 23, 2017

Kyle, did you have a chance to look into this?

@zmodem
Collaborator

zmodem commented Aug 29, 2017

Unblocking 5.0.0 as there seems to be no interest in fixing :-/

@tstellar
Collaborator

mentioned in issue #33840

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021