Clang/LLVM currently miscompiles the following program: // gvnbug.c #include <stdio.h> int foo(); void test(int* gp1, int* gp2) { int g = 0; int a = 0, b = 0; int x = 7777, y = 6666; // also try swapping these int* p = &g; int* q = &g; int* r = &g; if (foo()) { a = 1; p = &y+1; q = &x; } *gp1 = (int)p+1; if (q == p) { b = 1; *gp2 = (int)q+1; r = q; } *r = 42; printf("a = %d, b = %d, x = %d\n", a, b, x); } int main() { int gp1 = 0; int gp2 = 0; test(&gp1, &gp2); return 0; } // aux.c int foo() { return 1; } $ clang-5.0 aux.c gvnbug.c -o gvnbug -O3 && ./gvnbug a = 1, b = 1, x = 7777 This result is not allowed. If a and b are both 1, the branch "q == p" must have been taken, so r was set to &x (via q), so x cannot be 7777. I think this issue has already come up in <https://bugs.llvm.org/show_bug.cgi?id=34548>, but so far there was no example showing that the bug arises independent of the incorrect inttoptr-simplification. What is happening here (if my analysis is correct) is that GVN sees the equality "q == p" and uses that to replace "q" by "p" in the then-branch. Next, LLVM notices that because p is derived from y, writing to r (which will either have value &g or p in the line where the assignment happens) cannot possibly affect x, and hence the initial value of x can be propagated into the printf. GVN is wrong to perform this kind of replacement; just because the bit representations of two pointers are equal, that doesn't mean that their provenance information is equal. Test case by Gil Hur.
This problem does not just affect clang; even safe Rust has miscompilations due to this: <https://github.com/rust-lang/rust/issues/45839>
TL;DR Given the current state of the world and LLVM memory model, it's unclear what is okay and not. Your claim that this result is not allowed is ... not clearly right :) IE i'd stick with language that says "we don't want this to be right". Nuno, et al are working on a memory model, and this isn't going to get fixed until then.
Ralf is part of the memory model team btw. We will share a document + patches starting late next week.
(In reply to Daniel Berlin from comment #2) > TL;DR Given the current state of the world and LLVM memory model, it's > unclear what is okay and not. > > Your claim that this result is not allowed is ... not clearly right :) > > IE i'd stick with language that says "we don't want this to be right". > > Nuno, et al are working on a memory model, and this isn't going to get fixed > until then. And it won't be fixed because ATM, you can show that propagating any equalities at all (pointer or not) is enough to break variants of this testcase. Equality propagation exists all over the compiler (SimplifyInstruction does it too, as does CVP, LVI, you name it). There are also non-obvious forms of equality propagation that would have the same effect, and be very hard to stop from happening. You'd have to be able to stop anything from determining equivalence in general, and that's really really hard. IE if (pi > 48 && pi < 50) if (yi > 48 && yi < 50) (pi == yi inside here) Besides the intractability of solving all of the above ATM, disabling all of it causes very significant performance loss.
(In reply to Nuno Lopes from comment #3) > Ralf is part of the memory model team btw. > We will share a document + patches starting late next week. Oh, nice. Great. Then i'll stop explaining it :)
> you can show that propagating any equalities at all (pointer or not) is enough to break variants of this testcase. AFAIK these other examples rely on the inttoptr-simplification, don't they? > Equality propagation exists all over the compiler (SimplifyInstruction does it too, as does CVP, LVI, you name it). I don't even know what half of that is, sorry -- but I get your point. :) > Your claim that this result is not allowed is ... not clearly right :) > > IE i'd stick with language that says "we don't want this to be right". Well, fair enough. Can we agree that either the compilation is wrong, or the above program should somehow trigger UB? In the latter case, that would mean that safe Rust programs can be UB, i.e., we'd need to figure out a way to statically ensure that this kind of stuff does not happen.
(In reply to Ralf Jung from comment #6) > > you can show that propagating any equalities at all (pointer or not) is enough to break variants of this testcase. > > AFAIK these other examples rely on the inttoptr-simplification, don't they? Yes, but mainly out of laziness (with on offense meant!) We can definitely construct examples where that isn't true, i think folks have just not bothered because i think everyone agreed there is a problem once we saw the inttoptr examples. My only concern, for example, is that i have a straightforward and sane way to know what to disallow and that it doesn't affect performance hugely. IE All i have to do is disallow deriving equality from comparisons between two pointer values.
Hi, I recently came across what seems to be at least a similar bug, while developing an application. I eventually managed to produce the following test case, which is miscompiling for me using llvm-8.0.0-x86_64-apple-darwin. // gvnbug.c #include <stdlib.h> #include <stdio.h> typedef struct MyStruct { void* pointer1; void* pointer2; } MyStruct; MyStruct entry(const int* pointer, int index) { MyStruct r; if (pointer[index] == 0) { r.pointer1 = 0; r.pointer2 = 0; } else { r.pointer1 = (void*)0x123; r.pointer2 = (void*)0x456; } return r; } int main() { int* pointer = (int*) calloc(1, sizeof(int)); int index = 0; while (1) { MyStruct et = entry(pointer, index); if (et.pointer1 == NULL) break; index++; if (index == pointer[0]) { index = 0; } } pointer[index] = 1; MyStruct r = entry(pointer, index); printf("%p %p\n", r.pointer1, r.pointer2); return 0; } $ clang -c -O1 -emit-llvm gvnbug.c -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk -stdlib=libc++ -o - | opt -gvn | lli 0x0 0x456 The expected output is 0x123 0x456. Omitting the `-gvn` optimization produces the expected output. Is this the same bug?
Hi Kosta, If I understand correctly, it is more like a bug of PRE incorrectly assuming that r.pointer1 is equivalent to et.pointer1 at the first iteration of the loop. After -O1, the bitcode looks like this: ``` pointer = calloc(1, 4) et = entry(pointer, 0) %1 = et.pointer1 if (%1 != null) { // Note that %1 is always null, so this is never taken. %2 = pointer[0] // always 0 do { p = phi(0, %select) inc = p + 1 select = (inc == %2) ? 0 : inc // always inc et2 = entry(pointer, select) } while(et2.pointer1 != null) } i2 = phi(0, select) pointer[i2] = 1 r = entry(pointer, i2) printf(.., r.pointer1, r.pointer2) ``` GVN-PRE replaces r.pointer1 with phi(et.pointer1, et2.pointer1), which is incorrect because `pointer[i2] = 1` makes `r` and `et` different. The incorrect replacement may have been fired due to the existence of the null comparison, but I guess it is not related with our bug (which replaces a pointer with another one with different origin).