I think critical edge splitting is doing the right thing here, and splitting the critical edges enables (early) tail duplication of the computed goto.
Now, in theory, that also seems like the right thing to do, since this allows the computed gotos to be better predicted, which is pretty much the point of the whole exercise. But it looks like actually performing the tail duplication is slow, and regalloc can't really handle the result well:
Passing -disable-early-taildup makes this go away, so regalloc is fine with the edges just being split, without duplication:
---User Time---   --System Time--   --User+System--   ---Wall Time---   --- Name ---
3.5280 ( 66.2%)   0.0000 (  0.0%)   3.5280 ( 65.3%)   3.5313 ( 65.4%)   Branch Probability Basic Block Placement
0.4120 (  7.7%)   0.0040 (  5.3%)   0.4160 (  7.7%)   0.4154 (  7.7%)   Simple Register Coalescing
0.3080 (  5.8%)   0.0120 ( 15.8%)   0.3200 (  5.9%)   0.3196 (  5.9%)   Machine Block Frequency Analysis
I'm not really sure what we want to do here, but I'd say the thing that should be bailing early here is taildup, not edge splitting. Adding Kyle as the resident taildup expert.
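For anyone following along without the test case open, here is a minimal sketch of the kind of computed-goto dispatch loop being discussed. It is not taken from the attached file; the opcode names and the `run` function are made up for illustration, and it relies on the GNU C labels-as-values extension that GCC and Clang accept. As far as I understand it, the source has one indirect jump per handler, the front end funnels them all into a single indirect branch block, and tail duplication is what gives each handler its own indirect jump back.

```c
/* Hypothetical opcodes for a toy accumulator machine (illustration only). */
enum { OP_INC, OP_DEC, OP_DOUBLE, OP_HALT };

/* Interprets `code` until OP_HALT and returns the accumulator. */
int run(const unsigned char *code) {
    /* Labels-as-values: a GNU C extension, not standard C. */
    static void *const dispatch[] = {
        &&do_inc, &&do_dec, &&do_double, &&do_halt
    };
    int acc = 0;
    const unsigned char *pc = code;

    /* Every handler ends with its own indirect jump to the next handler. */
#define NEXT() goto *dispatch[*pc++]
    NEXT();
do_inc:
    acc += 1;
    NEXT();
do_dec:
    acc -= 1;
    NEXT();
do_double:
    acc *= 2;
    NEXT();
do_halt:
    return acc;
#undef NEXT
}
```

With N handlers all jumping to one shared dispatch block that can branch to any of the N handlers, every handler-to-handler edge is critical, so the number of edges to split (and the amount of code to duplicate) grows roughly quadratically with the handler count. That would be consistent with an interpreter-sized test case blowing up the way this one does.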
Extended Description
The attached (unreduced) test case needs 1.6GB peak RAM and ~3min to compile on my laptop after r296416; before that revision it took 100MB and 1s.