Bad code generation at "-O0" and "double" values #19518

llvmbot · 2014-03-14T22:17:26Z


Bugzilla Link	19144
Resolution	FIXED
Resolved on	Mar 18, 2014 09:33
Version	trunk
OS	Linux
Reporter	LLVM Bugzilla Contributor
CC	@echristo,@hfinkel

Extended Description

The following test code fails when compiled with -O0.

#include <stdarg.h>
#include <stdio.h>
#include <string.h>

int x = 1;
int q = 1;
double y = 7.0;

double foo() {
double d = x + y + q;
return d;
}

int main(void) {

double b = foo();
if (b != 9.0) {
printf("FAILED (result = %g but expect 9.0)\n", b);
} else {
printf("SUCCESS\n");
}
}

If "double" is replaced with "float" it works. -O1 and up have no problems.

echristo · 2014-03-14T22:30:19Z

Bill is the one who implemented fast-isel support for PPC. Bill?

llvmbot · 2014-03-14T22:40:33Z

Sounds like something simple with the int-to-float conversion code in fast-isel. I am heavily involved with a code delivery right now so it may be a week or two before I can look at this.

Question: Is this PPC32 or PPC64? Fast-isel should only be enabled for PPC64 as I haven't tested it for 32-bit, and I'm sure it's broken in a number of ways there. Just making sure that hasn't been turned on by accident.

llvmbot · 2014-03-14T23:04:55Z

This is on PPC64.

The function foo() is under 40-instructions in length.
So, I can paste that portion of a dissasembly into the bug report, or attach the .s if either would help.

-Paul (working w/ Nenad)

llvmbot · 2014-03-14T23:20:19Z

The function foo() is under 40-instructions in length.

I can do better than 40. Below are a simpler testcase and the corresponding assembly (now 12 instructions) for the problematic foo().

FWIW, the first instruction "li 3, 42" is not coincidental; changing the "42.0" in the C code makes a corresponding change in the asm. That is kind of suspicious to me.

-Paul

#include <stdio.h>
int x = 1;
double foo() { return x + 42.0; }
int main(void) {
double b = foo();
if (b != 43.0) {
printf("FAILED (result = %g but expect 43.0)\n", b);
} else {
printf("SUCCESS\n");
}
}

.LCPI0_0:
.quad 4631107791820423168
[...]
.L.foo:
li 3, 42
addis 4, 2, .LCPI0_0@toc@ha
lfd 0, .LCPI0_0@toc@l(4)
addis 4, 2, x@toc@ha
addi 4, 4, x@toc@l
lwa 4, 0(4)
std 4, -16(1)
lfd 1, -12(1)
fcfid 1, 1
fadd 1, 1, 0
std 3, -24(1)
blr

hfinkel · 2014-03-18T09:54:00Z

The function foo() is under 40-instructions in length.

I can do better than 40. Below are a simpler testcase and the corresponding
assembly (now 12 instructions) for the problematic foo().

FWIW, the first instruction "li 3, 42" is not coincidental; changing the
"42.0" in the C code makes a corresponding change in the asm. That is kind
of suspicious to me.

This is okay. Materializing 42.0 using the corresponding integer constant and then converting it to floating point is perfectly valid. One could argue that this is an 'optimization' that should not be performed when -O0 is requested, but that's another matter.

-Paul

#include <stdio.h>
int x = 1;
double foo() { return x + 42.0; }
int main(void) {
double b = foo();
if (b != 43.0) {
printf("FAILED (result = %g but expect 43.0)\n", b);
} else {
printf("SUCCESS\n");
}
}

.LCPI0_0:
.quad 4631107791820423168
[...]
.L.foo:
li 3, 42
addis 4, 2, .LCPI0_0@toc@ha
lfd 0, .LCPI0_0@toc@l(4)
addis 4, 2, x@toc@ha
addi 4, 4, x@toc@l
lwa 4, 0(4)
std 4, -16(1)
lfd 1, -12(1)

Now this is fishy. It looks like the offset on the floating-point load is calculated incorrectly. Given that both the std and the lfd deal with 8-byte data, and the lfd should be loading what the std just stored, the offset should be the same on both.

-Hal

    fcfid 1, 1
    fadd 1, 1, 0
    std 3, -24(1)
    blr

hfinkel · 2014-03-18T10:04:18Z

.LCPI0_0:
.quad 4631107791820423168
[...]
.L.foo:
li 3, 42
addis 4, 2, .LCPI0_0@toc@ha
lfd 0, .LCPI0_0@toc@l(4)
addis 4, 2, x@toc@ha
addi 4, 4, x@toc@l
lwa 4, 0(4)
std 4, -16(1)
lfd 1, -12(1)

Now this is fishy. It looks like the offset on the floating-point load is
calculated incorrectly. Given that both the std and the lfd deal with 8-byte
data, and the lfd should be loading what the std just stored, the offset
should be the same on both.

It looks like this may be the problematic bit:

in unsigned PPCFastISel::PPCMoveToFPReg we have:

unsigned LoadOpc = PPC::LFD;

if (SrcVT == MVT::i32) {
Addr.Offset = 4;
if (!IsSigned)
LoadOpc = PPC::LFIWZX;
else if (PPCSubTarget.hasLFIWAX())
LoadOpc = PPC::LFIWAX;
}

but if both (!IsSigned) and (PPCSubTarget.hasLFIWAX()) are false, then we're still loading using PPC::LFD, and we should not adjust the offset to 4.

-Hal

llvmbot · 2014-03-18T14:21:07Z

Agreed, that looks like the probable issue.

My other project took a turn for the better the last couple of days, so I am planning to look at this today. I need to update the fast-isel tests to include this case as well, which I apparently missed during the first go-round.

llvmbot · 2014-03-18T14:29:43Z

Ah, I see. The test case is written for -mcpu=pwr7, which is why this part didn't get tested. I'll add a -mcpu=pwr6 variant as part of the patch.

llvmbot · 2014-03-18T15:51:13Z

Hm. What processor are you compiling on? Looking at the code, I find it odd that we would generate an fcfid without generating a lfiwax.

We only generate any form of fcfid* in fast-isel if hasFPCVT() is true:

if (DstVT == MVT::f32 && !PPCSubTarget.hasFPCVT())
return false;

This feature is turned on for pwr7, a2, and a2q. But all of those already have the lfiwax feature enabled as well.

I'm suspicious that we might be falling back to DAG selection and there's a problem there as well.

llvmbot · 2014-03-18T15:58:51Z

Well, the last sentence is wrong, because fixing the address offset here does fix the problem, but I don't understand why we don't use lfiwax if we got this far. Will have to get out the debugger and see what's going on.

llvmbot · 2014-03-18T16:00:32Z

Oh, never mind, need more coffee. We're converting to double here, not float. Pish.

llvmbot · 2014-03-18T16:33:29Z

Fixed in r204155.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bad code generation at "-O0" and "double" values #19518

Bad code generation at "-O0" and "double" values #19518

llvmbot commented Mar 14, 2014

echristo commented Mar 14, 2014

llvmbot commented Mar 14, 2014

llvmbot commented Mar 14, 2014

llvmbot commented Mar 14, 2014

hfinkel commented Mar 18, 2014

hfinkel commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

Bad code generation at "-O0" and "double" values #19518

Bad code generation at "-O0" and "double" values #19518

Comments

llvmbot commented Mar 14, 2014

Extended Description

echristo commented Mar 14, 2014

llvmbot commented Mar 14, 2014

llvmbot commented Mar 14, 2014

llvmbot commented Mar 14, 2014

hfinkel commented Mar 18, 2014

hfinkel commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014

llvmbot commented Mar 18, 2014