It's pretty onerous. It looks like we are marking the R3 register as being clobbered, so I now think it's some
strange loop weirdness. I'm not skilled enough with bugpoint to get it to whittle this down to the pass that
could be causing the problem.
Extended Description
There seems to be a problem with how code is generated for a call after a loop is unrolled. Consider
this C code (on a PowerBook G4 machine):
typedef unsigned long int mp_limb_t;

typedef struct {
  int _mp_alloc;
  int _mp_size;
  mp_limb_t *_mp_d;
} __mpz_struct;

typedef __mpz_struct mpz_t[1];
typedef const __mpz_struct *mpz_srcptr;
typedef __mpz_struct *mpz_ptr;

void Foo(mpz_srcptr base) {
  unsigned i;
  mpz_t want;

  __gmpz_init(want);
  for (i = 0; i < 2; i++) {
    __gmpz_mul(want, want, base);
  }
  __gmpz_clear(want);
}

void Bar(mpz_srcptr base) {
  mpz_t want;

  __gmpz_init(want);
  __gmpz_mul(want, want, base);
  __gmpz_mul(want, want, base);
  __gmpz_clear(want);
}
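To check the argument passing without linking against libgmp, something like the following might work. It is only a sketch: the three `__gmpz_*` stubs, the `init_ptr`, `clear_ok` and `mul_calls` variables, and the `clear_matches_init` helper are hypothetical stand-ins I made up for illustration, not GMP code. The stubs just record whether `__gmpz_clear` receives the same pointer that `__gmpz_init` saw.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long int mp_limb_t;
typedef struct {
  int _mp_alloc;
  int _mp_size;
  mp_limb_t *_mp_d;
} __mpz_struct;
typedef __mpz_struct mpz_t[1];
typedef const __mpz_struct *mpz_srcptr;
typedef __mpz_struct *mpz_ptr;

/* Hypothetical bookkeeping, not part of GMP. */
static mpz_ptr init_ptr; /* pointer passed to the most recent init */
static int clear_ok;     /* 1 if clear got the same pointer as init */
static int mul_calls;    /* number of mul calls since the last reset */

/* Stub: zero the struct and remember which object was initialized. */
void __gmpz_init(mpz_ptr m) {
  m->_mp_alloc = 0;
  m->_mp_size = 0;
  m->_mp_d = NULL;
  init_ptr = m;
}

/* Stub: no arithmetic, only checks the destination and counts the call. */
void __gmpz_mul(mpz_ptr r, mpz_srcptr a, mpz_srcptr b) {
  (void)a;
  (void)b;
  assert(r == init_ptr);
  mul_calls++;
}

/* Stub: compare against the init pointer while the object is still alive. */
void __gmpz_clear(mpz_ptr m) {
  clear_ok = (m == init_ptr);
}

void Foo(mpz_srcptr base) {
  unsigned i;
  mpz_t want;

  __gmpz_init(want);
  for (i = 0; i < 2; i++) {
    __gmpz_mul(want, want, base);
  }
  __gmpz_clear(want);
}

void Bar(mpz_srcptr base) {
  mpz_t want;

  __gmpz_init(want);
  __gmpz_mul(want, want, base);
  __gmpz_mul(want, want, base);
  __gmpz_clear(want);
}

/* Runs fn once; returns 1 if clear saw the init pointer after two muls. */
int clear_matches_init(void (*fn)(mpz_srcptr)) {
  __mpz_struct base = {0, 0, NULL};

  init_ptr = NULL;
  clear_ok = 0;
  mul_calls = 0;
  fn(&base);
  return clear_ok && mul_calls == 2;
}
```

With a correct code generator, `clear_matches_init(Foo)` and `clear_matches_init(Bar)` should both return 1. If the first argument register holds garbage at the `__gmpz_clear` call, the Foo check would be expected to return 0.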
If you look at "testcase.llvm.s", you'll notice that, even though both Foo and Bar functions compute the
exact same thing, the Foo function, because it's a loop that's been unrolled, has an "implicit def" of R3
before the gmpz_clear call:
while the Bar case has:
The Foo case would be fine if R3 weren't trashed in the calls, but it appears to be. My guess is
that R3 should be marked as "clobbered" across calls. The "mpz_clear" function is in a library
and looks like this:
void mpz_clear (mpz_ptr m) {
  (*__gmp_free_func) (m->_mp_d, m->_mp_alloc * BYTES_PER_MP_LIMB);
}
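The reason the argument register matters is that mpz_clear dereferences its argument right away. Here is a rough, self-contained sketch of that. The definition of `BYTES_PER_MP_LIMB` as `sizeof(mp_limb_t)`, the `recording_free` hook, the `freed_addr` and `freed_size` variables, and the `clear_frees_limbs` helper are all assumptions made for illustration; the real library sets these up itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef unsigned long int mp_limb_t;
typedef struct {
  int _mp_alloc;
  int _mp_size;
  mp_limb_t *_mp_d;
} __mpz_struct;
typedef __mpz_struct *mpz_ptr;

/* Assumption: one limb occupies sizeof(mp_limb_t) bytes. */
#define BYTES_PER_MP_LIMB sizeof(mp_limb_t)

/* Hypothetical bookkeeping: what the free hook was last called with. */
static uintptr_t freed_addr;
static size_t freed_size;

/* Hypothetical hook: record the arguments, then release the block. */
static void recording_free(void *p, size_t n) {
  freed_addr = (uintptr_t)p;
  freed_size = n;
  free(p);
}

void (*__gmp_free_func)(void *, size_t) = recording_free;

/* Same body as the library function quoted above: m is read immediately. */
void mpz_clear(mpz_ptr m) {
  (*__gmp_free_func)(m->_mp_d, m->_mp_alloc * BYTES_PER_MP_LIMB);
}

/* Returns 1 if mpz_clear handed the limb buffer and its size to the hook. */
int clear_frees_limbs(void) {
  __mpz_struct z;
  uintptr_t expected;

  z._mp_alloc = 4;
  z._mp_size = 0;
  z._mp_d = malloc(4 * BYTES_PER_MP_LIMB);
  if (z._mp_d == NULL) {
    return 0;
  }
  expected = (uintptr_t)z._mp_d; /* capture the address before it is freed */
  mpz_clear(&z);
  return freed_addr == expected && freed_size == 4 * BYTES_PER_MP_LIMB;
}
```

If `m` does not point at the initialized struct, both `m->_mp_d` and `m->_mp_alloc` are read from the wrong place, so the free hook gets a bogus pointer and size.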
Now, gcc produces code for Foo that looks like this:
L2:
        addi r3,r1,56
        mr r5,r29
        mr r4,r3
        bl L___gmpz_mul$stub
        addic. r30,r30,-1
        bne- cr0,L2
        addi r3,r1,56
        bl L___gmpz_clear$stub
and Bar that looks like this:
So it's doing the correct thing. We should too.
BTW, this is the only test which failed in the GMP testsuite. Woo!
-bw