LICM sinking a single unused load out of a loop prevents DSE in some other function!?! #19147

Closed
chandlerc opened this issue Feb 8, 2014 · 2 comments
Labels
bugzilla Issues migrated from bugzilla loopoptim

Comments

@chandlerc
Member

Bugzilla Link 18773
Resolution FIXED
Resolved on Feb 10, 2014 16:39
Version trunk
OS Linux
Attachments reduced test case, Narrower test case
CC @hfinkel

Extended Description

This is crazy weird. The essential problem is that LICM got slightly more powerful in r200067 and started sinking loads out of the loop body in a corner case. When it does so, we mysteriously stop being able to DSE stores in a separate function!

The code for this is heavily reduced from Adobe-C++/loop_unroll.cpp. To reproduce the weird behavior, take an "opt" binary from trunk and an "old_opt" binary from r200066, and compare:

% opt < loop_unroll.reduced.ll -std-link-opts | llc -O3 -o loop_unroll.new.s
% clang++ -lm -o loop_unroll.new loop_unroll.new.s

vs.

% old_opt < loop_unroll.reduced.ll -std-link-opts | llc -O3 -o loop_unroll.old.s
% clang++ -lm -o loop_unroll.old loop_unroll.old.s

When I benchmark these on my sandybridge machine I get:

% perf stat -r5 ./loop_unroll.new && perf stat -r5 ./loop_unroll.old

Performance counter stats for './loop_unroll.new' (5 runs):

   1376.720033 task-clock                #    0.998 CPUs utilized            ( +-  0.10% )
             1 context-switches          #    0.001 K/sec                    ( +- 31.62% )
             0 cpu-migrations            #    0.000 K/sec                    ( +- 61.24% )
           152 page-faults               #    0.111 K/sec                    ( +-  0.16% )
 5,208,493,602 cycles                    #    3.783 GHz                      ( +-  0.02% )
 2,657,152,524 stalled-cycles-frontend   #   51.02% frontend cycles idle     ( +-  0.04% )
   279,081,881 stalled-cycles-backend    #    5.36% backend  cycles idle     ( +-  0.34% )
 8,515,758,351 instructions              #    1.63  insns per cycle        
                                         #    0.31  stalled cycles per insn  ( +-  0.00% )
   356,723,894 branches                  #  259.111 M/sec                    ( +-  0.00% )
       201,373 branch-misses             #    0.06% of all branches          ( +-  0.01% )

   1.378844222 seconds time elapsed                                          ( +-  0.10% )

Performance counter stats for './loop_unroll.old' (5 runs):

    877.683533 task-clock                #    0.998 CPUs utilized            ( +-  0.09% )
             1 context-switches          #    0.001 K/sec                    ( +- 40.82% )
             0 cpu-migrations            #    0.000 K/sec                    ( +-100.00% )
           152 page-faults               #    0.174 K/sec                    ( +-  0.16% )
 3,320,502,190 cycles                    #    3.783 GHz                      ( +-  0.00% )
    11,331,992 stalled-cycles-frontend   #    0.34% frontend cycles idle     ( +-  2.39% )
    35,003,292 stalled-cycles-backend    #    1.05% backend  cycles idle     ( +-  1.14% )
 6,978,870,054 instructions              #    2.10  insns per cycle        
                                         #    0.01  stalled cycles per insn  ( +-  0.00% )
   356,371,371 branches                  #  406.036 M/sec                    ( +-  0.00% )
       200,790 branch-misses             #    0.06% of all branches          ( +-  0.04% )

   0.879302485 seconds time elapsed                                          ( +-  0.09% )

Note the 51% stalled frontend cycles on the new one!!!
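For a quick sense of the size of the regression, the counters from the two reports above can be compared directly. This is only a sketch of the arithmetic: the numbers are copied from the `perf stat` output, while the dictionary keys and the `ratio` helper are names made up for illustration.

```python
# Counter values copied from the two `perf stat` reports above.
# Key names are illustrative, not perf's own identifiers.
new = {
    "task_clock_ms": 1376.720033,
    "cycles": 5_208_493_602,
    "instructions": 8_515_758_351,
    "frontend_stalls": 2_657_152_524,
}
old = {
    "task_clock_ms": 877.683533,
    "cycles": 3_320_502_190,
    "instructions": 6_978_870_054,
    "frontend_stalls": 11_331_992,
}


def ratio(key):
    """Return how many times larger the new counter is than the old one."""
    return new[key] / old[key]


# Extra instructions executed by the binary built with the new opt.
extra_instructions = new["instructions"] - old["instructions"]

for key in new:
    print(f"{key}: new/old = {ratio(key):.2f}x")
print(f"extra instructions: {extra_instructions:,}")
```

Time and cycles grow by roughly the same factor, and by more than the instruction count does, which is consistent with the stalled frontend cycles accounting for much of the slowdown rather than the extra instructions alone.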

I'm working on getting two A/B inputs to trunk 'opt' that exhibit the behavior, since I lack any good ideas about why it's actually happening.

Note that I've checked -- top of tree and r200067 behave exactly the same. The change is only in the patch committed with r200067.

@chandlerc
Member Author

I have this diagnosed fully and am working on a fix... =[ The root cause is rather horrid.

@chandlerc
Member Author

This should be fixed in r201104. There may be more demons like this lurking, but those should get new PRs.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021