bottom-up register-reduction scheduling actually increases register pressure #1447
Comments
I am going to take a closer look at this. But I don't see the problem you are
*** Final schedule ***
foo:
Multiplies are intermixed with the adds. Dan, are you working with ToT?

Hmmm... I see there are some issues. Hopefully I can fix the general cases.

I used -march=x86-64 above; the code you posted looks like 32-bit code. Offhand
Here's a case that shows the problem on both x86 and x86-64. In this test, the
float %foo(float* %a) {

Yeah, the x86 and x86-64 cases scheduled differently because of the calling convention. I have fixed this. Doing some tests now.

Fixed.

Thanks Evan, please add a test for this. :)

Ah, you did; it's Regression/CodeGen/X86/2007-01-08-InstrSched.ll. Thanks :)
Extended Description
A high Sethi-Ullman number for an operation indicates it should be scheduled
earlier rather than later, from a top-down perspective. However, in bottom-up
scheduling, Sethi-Ullman numbers are currently translated directly into priority
values, so such operations end up being scheduled earlier from a bottom-up
perspective. The result is that register pressure is actually increased instead
of reduced. The effect is somewhat hidden because the special-case values are
also inverted; for example, a store is given a low Sethi-Ullman number, but it
ends up having the desired effect. In larger test cases, fewer nodes use the
special cases, and the effect is much more visible.
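For reference, the classic Sethi-Ullman numbering can be sketched as a small standalone model (this is an illustration of the textbook formula, not the SelectionDAG implementation):

```python
def sethi_ullman(node, children):
    """Sethi-Ullman number: registers needed to evaluate `node`'s subtree.
    `children` maps a node to its list of operands; a leaf needs one register."""
    kids = children.get(node, [])
    if not kids:
        return 1
    nums = sorted((sethi_ullman(k, children) for k in kids), reverse=True)
    # Evaluate the costlier subtree first; each later child needs one extra
    # register per already-computed sibling result being held live.
    return max(n + i for i, n in enumerate(nums))
```

In a top-down order, the node with the higher number should go first so its register-hungry subtree does not overlap the others. Translating the number directly into a bottom-up priority picks that node first in bottom-up order, which places it last in the final top-down schedule.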
I've experimented a little with having the priority function take the inverse of
the SethiUllman value, and inverting the special-case values, and I've seen
spill counts drop significantly in large test cases on x86-64.
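The change amounts to flipping the sign of the translation from Sethi-Ullman number to scheduling priority. A sketch, with hypothetical function names (this is not LLVM's actual code):

```python
def buggy_bottom_up_priority(su_number):
    # Using the Sethi-Ullman number directly: a high-SU node is picked
    # first in bottom-up order, so it lands *late* in the final top-down
    # schedule, the opposite of what the number calls for.
    return su_number

def fixed_bottom_up_priority(su_number):
    # Inverting the value: a high-SU node gets low bottom-up priority,
    # is picked late bottom-up, and lands *early* top-down as intended.
    return -su_number
```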
Here's a simple case that shows the problem:
float %foo(float %x) {
%tmp1 = mul float %x, 3.000000e+00
%tmp3 = mul float %x, 5.000000e+00
%tmp5 = mul float %x, 7.000000e+00
%tmp7 = mul float %x, 1.100000e+01
%tmp10 = add float %tmp1, %tmp3
%tmp12 = add float %tmp10, %tmp5
%tmp14 = add float %tmp12, %tmp7
ret float %tmp14
}
With -march=x86-64, the list-burr (default) schedule places all the multiplies
before all the adds. With the change mentioned above, the multiplies are
intermixed with the adds, so the register pressure is lower.
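As a sanity check, a small liveness model (a hypothetical helper, not compiler output) counts the peak number of simultaneously live values for the two schedules of the IR above, treating the argument %x as the first definition:

```python
def peak_pressure(schedule, uses):
    """Peak count of simultaneously live values for a linear schedule.
    `schedule` lists value definitions in order; `uses` maps a value to
    its operands. A value dies at its last use."""
    last_use = {}
    for i, name in enumerate(schedule):
        for op in uses.get(name, []):
            last_use[op] = i
    live, peak = set(), 0
    for i, name in enumerate(schedule):
        live.add(name)                   # the result is now live
        peak = max(peak, len(live))      # operands and result overlap here
        for op in uses.get(name, []):
            if last_use[op] == i:        # last use: the operand dies
                live.discard(op)
    return peak

uses = {"t1": ["x"], "t3": ["x"], "t5": ["x"], "t7": ["x"],
        "t10": ["t1", "t3"], "t12": ["t10", "t5"], "t14": ["t12", "t7"]}
mults_first = ["x", "t1", "t3", "t5", "t7", "t10", "t12", "t14"]
interleaved = ["x", "t1", "t3", "t10", "t5", "t12", "t7", "t14"]
print(peak_pressure(mults_first, uses))   # all multiplies first: 5
print(peak_pressure(interleaved, uses))   # multiplies intermixed: 4
```

The multiplies-first order keeps all four products plus %x live at once, while the intermixed order consumes each product as soon as its add is ready.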