1072 – Investigate elimination of the -raise pass

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 1072 - Investigate elimination of the -raise pass

Summary: Investigate elimination of the -raise pass

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations (show other bugs)
Version:	trunk
Hardware:	Macintosh MacOS X

Importance:	P enhancement
Assignee:	Reid Spencer

URL:
Keywords:	code-cleanup

Depends on:
Blocks:

Reported:	2007-01-03 19:39 PST by Chris Lattner
Modified:	2010-02-22 12:47 PST (History)
CC List:	3 users (show)

See Also:
Fixed By Commit(s):

Attachments
External Nightly Test Results With -raise (3.78 KB, text/plain) 2007-02-02 00:27 PST, Reid Spencer	Details
External Nightly Test Results Without -raise (5.60 KB, text/plain) 2007-02-02 06:08 PST, Reid Spencer	Details
Performance Data Spreadsheet (44.00 KB, application/octet-stream) 2007-02-02 06:57 PST, Reid Spencer	Details
Performance Data Spreadsheet (32.50 KB, application/octet-stream) 2007-02-02 07:41 PST, Reid Spencer	Details
Olden Performance Comparison (16.50 KB, application/octet-stream) 2007-02-02 18:55 PST, Reid Spencer	Details
External Performance Comparison (Spreadsheet) (31.00 KB, application/octet-stream) 2007-02-03 09:32 PST, Reid Spencer	Details
255.vortex difference on stripped files (2.60 KB, text/plain) 2007-02-03 13:14 PST, Reid Spencer	Details
433.milc differences (640 bytes, text/plain) 2007-02-03 13:23 PST, Reid Spencer	Details
456.hmmer differences (31.70 KB, text/plain) 2007-02-03 13:30 PST, Reid Spencer	Details
447.dealII differences (60.72 KB, text/plain) 2007-02-03 13:49 PST, Reid Spencer	Details
Show Obsolete (3) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Chris Lattner 2007-01-03 19:39:38 PST

The -raise pass was mainly around to eliminate casts from int -> uint etc through some tricky 
transformations.  With the recent signless work, this pass should be mostly useless.  Eliminating it is good 
as it reduces compile time (one fewer pass), eliminates a bunch of ugly code (the raise pass is very nasty), 
and potentially eliminating the LLVMTransforms library (ExprTypeConvert.cpp, LevelRaise.cpp, and 
TransformInternals.cpp).

We should do performance analysis (e.g. spec2k and 2k6) to see whether this is completely dead, and if 
not, reimplement the useful pieces in other passes.

-Chris

Comment 1 Chris Lattner 2007-01-05 20:16:54 PST

When this happens, we can eliminate Type::canLosslesslyBitCastTo and probably some other stuff only 
used by -raise.

-Chris

Comment 2 Reid Spencer 2007-02-01 13:42:24 PST

Is there anything preventing this from being eliminated now? 

Can we use the stats to determine how much it fires (if at all)?

What pieces does it implement that might still be useful to retain?

Comment 3 Chris Lattner 2007-02-01 13:45:00 PST

Nothing prevents it.  We just need someone to do an llvm-test run with an without it to see if there are 
any perf differences.

-Chris

Comment 4 Chris Lattner 2007-02-01 13:45:36 PST

or even just run SPEC2K + SPEC2K6.

-Chris

Comment 5 Devang Patel 2007-02-01 14:47:07 PST

I will set up overnight performance test run.

Comment 6 Reid Spencer 2007-02-01 21:12:09 PST

I want this gone. :)

Comment 7 Reid Spencer 2007-02-02 00:27:40 PST

Created attachment 610 [details]
External Nightly Test Results With -raise

Comment 8 Reid Spencer 2007-02-02 06:08:39 PST

Created attachment 611 [details]
External Nightly Test Results Without -raise

Comment 9 Reid Spencer 2007-02-02 06:57:15 PST

Created attachment 612 [details]
Performance Data Spreadsheet

This attachment contains a spreadsheet in Excel format that contains the raw
data and a comparison of the External tests with and without the -raise pass.

Comment 10 Reid Spencer 2007-02-02 07:40:28 PST

Summary Of Performance Results (See the Spreadsheet)

Four SPEC test suites were compared: CINT2000, CFP2000, CINT2006, and CFP2006.

In general it looks positive to remove the -raise pass. The overall results 
(sum of all tests) are as follows:

                            +--Compile-+   +----------Run Times--------+
ITEM       GCCAS   Bytecode LLC    JIT     GCC     CBE     LLC     JIT
Delta      151.05  818298   -9.07  16.2    14.66   20.81   46.40   57.15
Percent     6.49%   1.95%    -1.09% 3.09%   2.18%   3.14%   7.12%   4.04%

The "Delta" values shown are the difference in aggregate time between "with
-raise" and "without -raise" runs across all test cases.  Negative values mean
the "without -raise" value was larger. Positive values mean the "without -raise"
value was smaller. Because these values represent the subtraction of the
"without -raise" from the "with -raise" data sets, positive values favor removal
of -raise while negative values favor keeping -raise.

The "Percent" values are computed by dividing the "Delta" value by the "with
-raise" sum. This gives an approximation of the magnitude of change introduced
by removing the -raise pass.


To interpret this correctly, one would say: "On average, the -raise pass
increases bytecode volumes by nearly 2% and LLC runtimes by over 7%". 

The only negative impact of removing -raise seems to be a 1.09% increase in the
LLC compilation time. This is rather surprising given that there is less work
done by llc without -raise. 

Note that these results account for only one run of the nightly tests.
Additional samples would be welcome.

Comment 11 Reid Spencer 2007-02-02 07:41:58 PST

Created attachment 613 [details]
Performance Data Spreadsheet

This updates some formatting and gets rid of columns that weren't run in both
tests and rows that indicate test failures.

Comment 12 Reid Spencer 2007-02-02 08:10:21 PST

Chris,

Please let me know if this is sufficient evidence to support removal of the
-raise pass.

Reid

Comment 13 Chris Lattner 2007-02-02 12:20:59 PST

Yes, it looks like we are close to it.  Things worth investigating:

* What does -raise do on 401.bzip2 that shrinks the bc file 18K?  Once understood, we can reimplement 
this in some other pass.

* I have little confidence in the stabilities of these numbers, they seem very noisy.  Can you see if the 
254.gap/LLC, 254.gap/CBE, 255.vortex/CBE, and 256.bzip2/cbe slowdowns reproduce?  If so, we should 
investigate what is happening there too.

-Chris

Comment 14 Reid Spencer 2007-02-02 17:09:41 PST

> * What does -raise do on 401.bzip2 that shrinks the bc file 18K?  Once 
> understood, we can reimplement this in some other pass.

I ran 401.bzip2 using "head" from today and compared it against the "without
-raise" run I did last night. The bytecode sizes are identical. Consequently, I
don't think I was comparing apples with apples. I used the results of my nightly
run from a couple days ago for the "with -raise" data but my "without -raise"
also included the shift patch, which may have affected optimization (positively).

So, guess I get to do this again.

Devang, did you run this comparison on Darwin?


> * I have little confidence in the stabilities of these numbers, they seem very
>  noisy.

I eliminated as much interference workload as I could by shutting down email,
xchat, etc. and just let it run. I think the issue here is the difference in the
builds. I'll run the comparison again and report new numbers. Unfortunately,
it'll have to wait until this evening for "quiet machine".

> Can you see if the 254.gap/LLC, 254.gap/CBE, 255.vortex/CBE, and 
> 256.bzip2/cbe slowdowns reproduce?  If so, we should investigate what is
> happening there too.

I'll defer this until after I get new numbers. If they still look problematic,
I'll investigate then.

Comment 15 Chris Lattner 2007-02-02 17:12:23 PST

Makes sense.  A lot has changed in a few days.  I suspect the 'noise' is really measuring these changes :)

-Chris

Comment 16 Reid Spencer 2007-02-02 17:30:45 PST

Yes. I now have two environments that were updated at the same moment. The only
difference is that one has -raise and one doesn't. This should be good for
testing. I'm going to do Olden with GET_STABLE_NUMBERS=1 because its relatively
quick and a good test for stability. If that goes okay, I'll get to External
tests this weekend (7+ hours each).

Comment 17 Chris Lattner 2007-02-02 17:32:51 PST

sounds great, thanks reid!

Comment 18 Reid Spencer 2007-02-02 18:55:38 PST

Created attachment 614 [details]
Olden Performance Comparison

This looks much more reasonable. The difference in the runs is in the small
"noise" range resulting from me using the computer while the test is ongoing.
These were done single threaded and I'll do the same as for the Externals.

Interesting note: there is 0 difference in bytecode sizes. Perhaps -raise
doesn't do anything at all on Olden?

Comment 19 Chris Lattner 2007-02-02 18:57:13 PST

> there is 0 difference in bytecode sizes. Perhaps -raise
> doesn't do anything at all on Olden? 

Cool.  You could diff them to verify.

This is a good sign -raise isn't doing anything any more. :)  SPEC will give more certainty though.

Comment 20 Reid Spencer 2007-02-02 19:01:38 PST

[reid@bashful llvm-test-1]$ for f in
MultiSource/Benchmarks/Olden/*/Output/*.llvm.bc ; do
> cmp $f ../llvm-test-4/$f
> done
[reid@bashful llvm-test-1]$ echo $?
0

Looks pretty identical to me.

Comment 21 Reid Spencer 2007-02-03 09:32:04 PST

Created attachment 615 [details]
External Performance Comparison (Spreadsheet)

A comparison of the SPEC tests with and without -raise.

Comment 22 Reid Spencer 2007-02-03 09:53:10 PST

Looking at the External test comparison, it doesn't look like -raise does much. 

However a few things could be investigated:

255.vortex - Why does raise eliminate 733 bytes of bytecode?
433.milc   - What does raise do to make llc 1.9 seconds faster?
456.hmmer  - What does raise do to make llc 1.54 seconds fasters?
447.dealII - Why is "gccas" (opt) 2.93 seconds faster with -raise?
400.perlbench - What does raise do to make jit 1.04 seconds faster, especially 
                with 1133 more bytecode?

Comment 23 Reid Spencer 2007-02-03 09:54:38 PST

We should generate comparisons on Darwin/PPC as well just to make sure that
-raise doesn't assist any backend opts on that platform.  I don't see why it
would, but we should check.

Comment 24 Reid Spencer 2007-02-03 13:13:07 PST

For 255.vortex, here's what I found:

1. The bytecode size difference is due to the -raise pass removing or 
   altering the names of values. A diff of the llvm assembly corresponding to
   the .llvm.bc files from both cases produced 6000 lines of diffs. But, the
   only way they differed was in the names of values. I ran opt -strip on these
   files and re-ran the diff. It was vastly shorter. I'll attach it.

2. The diff produces in #1 showed that -raise will fold a GEP instruction into
   a store instruction using a GEP constant expression. Without -raise this 
   doesn't happen. There is also some changes in the # of uses probably related
   to the GEP folding into the store.

Its not clear why -instcombine wouldn't do the same transform as #2.

Comment 25 Reid Spencer 2007-02-03 13:14:32 PST

Created attachment 616 [details]
255.vortex difference on stripped files

This file contains the llvm assembly differences for 255.vortex.llvm.bc after 
opt -strip has been run on them.

Comment 26 Reid Spencer 2007-02-03 13:23:38 PST

Created attachment 617 [details]
433.milc differences

For 433.milc the only difference is that -raise moved a GEP instruction down
one instruction in its basic block. Not sure how that can produce a 1.9 second
difference in execution time.

Comment 27 Reid Spencer 2007-02-03 13:30:20 PST

Created attachment 618 [details]
456.hmmer differences

There are numerous differences on this test case. More analysis will be needed.

Comment 28 Reid Spencer 2007-02-03 13:49:29 PST

Created attachment 619 [details]
447.dealII differences

Again, there's a lot of differences in this one. Many of them seem to be the
relocation of gep instructions.

Comment 29 Reid Spencer 2007-02-03 13:57:34 PST

For 400.perlbench:

Unfortunately, my machine doesn't have enough memory to do a diff of two 2.7GB
LLVM assembly files.

Comment 30 Chris Lattner 2007-02-03 16:30:56 PST

the 255.vortex differences aren't worth worrying about, we're good there.

Comment 31 Chris Lattner 2007-02-03 17:07:08 PST

ok, go ahead and nuke it.

Comment 32 Reid Spencer 2007-02-03 17:19:15 PST

The -raise pass was remvoed with these patches:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070129/043817.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070129/043818.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070129/043819.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070129/043820.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070129/043821.html

Resolved.