-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InstCombine cannot blindly assume that inttoptr(ptrtoint x) -> x #33896
Comments
Personally I get "a=100 x=0" for this test case. Does the soundness issue go away if we stop optimizing 'n' to 1? That seems like a much better fix, IMO. Integer/pointer casts are far more important to optimize through than |
Reid, try this change: int a = 0, y[1], x = 0; |
Doesn't seem to repro here as well (x86-64_linux) [same happens with John's suggested change]. [davide@cupiditate bin]$ ./clang c.c b.c -O1 -o foo && ./foo |
Also [davide@cupiditate bin]$ ./clang c.c -O0 -S -emit-llvm -o - | ./opt -S -sroa > pre.ll Presumably this might have been fixed/hidden? Which revision are you at? Side note, I had to add <stdint.h> to the code to get it compile. |
Weird, I see "a=0 x=0" at -O2 using both 4.0 and 5.0 on OS X and Ubuntu 16.04 on x86-64. |
Can you verify this still happen on ToT ? In the meanwhile, I'm going to try with 5.0 on my machine (see if that applies, and bisect in case). |
I could not reproduce the problem either (same IR) |
Johns-MacBook-Pro-2:code regehr$ clang -v Johns-MacBook-Pro-2:code regehr$ clang -O2 alias4.c alias4-b.c Johns-MacBook-Pro-2:code regehr$ cat alias4.c void f(int *, int *); int main() { if (n) { if (n) { *(int *)pi = 15; printf("a=%d x=%d\n", a, x); f(&x, y); return 0; Johns-MacBook-Pro-2:code regehr$ cat alias4-b.c |
I don't think we do that? At least, I can't reproduce that with trunk clang... and it's pretty clearly not a legal transform. See https://reviews.llvm.org/rL249490 . |
Ok, so I guess some of you cannot reproduce this because your clang has a default option to enable some sanitization. $ ./bin/clang -cc1 -triple x86_64-pc-linux-gnu -emit-obj -O2 -o c.o c.c -internal-externc-isystem /usr/include/ -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-isystem /home/nuno/llvm-build/lib/clang/6.0.0/include $ ./bin/clang -o foo b.c c.o $ ./foo |
I was able to repro with John's reordering of the variables, but that doesn't seem that important. Why can't we assume inttoptr(ptrtoint(x)) is a no-op? What's wrong with that in principle? "a=0 x=0" seems like a real bug, but I'm not convinced this is the cause. |
Can someone attach a ll file and the opt commandline to reproduce the problem consistently? |
The problem really stems the following rule from http://llvm.org/docs/LangRef.html#pointer-aliasing-rules:
I think Nuno wrote up a good description somewhere, but essentially the problem is that we end up erasing dependencies, so alias analysis returns the wrong result. |
Is this the problem: inttoptr(ptrtoint(x)) + int_value where the int_value is actually a pointer but x is not. If the llvm folds the first, this becomes a getelementptr with x as the base, then the result value won't aliased with what int_value points to anymore? |
I think that this is essentially correct. The problem here is that inttoptr(ptrtoint(x)) only appears be based on x. However, in this case, the inttoptr is also control dependent on a comparison between some other pointer values. The resulting pointer, therefore, is based on all of these pointers. AA won't see this, just based on inttoptr(ptrtoint(x)) == x, because it will only see x. |
I can now reproduce the problem. An IR reproducible: llc good.ll > good.s ./a.out opt -early-cse-memssa -S < good.ll | llc > bad.s ./a.out opt -early-cse-memssa -S < good.ll > bad.ll The key diff between good.ll and bad.ll are < %add.ptr.x = select i1 %cmp, i32* %add.ptr, i32* %x
|
I suspect the answer is still yes because an answer of no is worse. So %add.ptr.x is defined as: %add.ptr = getelementptr inbounds [1 x i32], [1 x i32]* %y, i64 0, i64 1 And CSE is just applying the identity: q := (a != b ? b : a) => q := b and that certainly seems reasonable to me. FWIW, as currently defined, the icmp is currently well defined even for operands that point to different objects (we currently define the result in terms of the result of ptrtoint on each operand). |
That's precisely the problem though -- given what you said, the expression is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be safely transformed to "q := b". |
I don't see the connection with ptrtoint/intptr. Whether the conditional assignment is optimized into q := b seems irrelevant. Suppose we have a local pointer analysis that can see through this and will know pi's value is y+1 (after the first select). I think the real problem is that the address y+1 is out of bounds of y, though there is no out of bounds access, the compiler can not assume that this address does not alias with other object, in this case 'x' (the second issue in CSE mentioned in comment #16) |
Indeed. Good point. We could say that this transformation is unsafe unless we know that both a and b have the same underlying object. |
Yes, we could define things this way. It would make our AA much less powerful (because we'd need to assume that any pointer we could not prove in bounds could potentially alias with all other pointers). We should find a different solution. |
But if ptrtoint(a) == ptrtoint(b) does not imply that a == b, that also seems unfortunate. I think, however, that I'd still prefer to restrict the inttoptr(ptrtoint(x)) folding to this. |
Re #21. Suppressing the optimization for this cases looks like paper overing the real problem. Regarding AA, I think in practice, the common case is that off-by-one address is used as sentinel value, so we only need to teach AA to handle this explicit out of bound case instead of making AA generally less conservative. Besides, we can teach AA to look at real references. If there are, we can assume OOB does not exist -- otherwise the behavior is UB. |
less conservative --> more conservative.
|
That's pretty fundamental, unfortunately. If we want to do high level AA inferences (like pointers derived off of two distinct malloc()'s do not alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).
I'm not sure if you meant "restricting inttoptr(ptrtoint(x))==x is more preferable" or the converse (the "however" threw me off :) ), but in any case, I think we have to restrict both. We can't assume ptrtoint(a) == ptrtoint(b) implies a == b for the reason stated above. And if we allow inttoptr(ptrtoint(x)) == x (but not "ptrtoint(a) == ptrtoint(b) implies a == b") then we have: i64 A = ... and we replace C with B (since we're dealing with integers here so our restriction does not apply), and the whole program was actually: // X and Y are 1 byte allocas where the address X+1 is the same as Y then the inttoptr(ptrtoint(x)) == x rule gives us: // X and Y are 1 byte allocas where the address X+1 is the same as Y which is an out of bounds access. |
I'm still not seeing how this causes a problem. If I have two pointers, a and b, and they don't alias, but ptrtoint(a) == ptrtoint(b), then they must have non-overlapping lifetimes. To have non-overlapping lifetimes, there must be some mutually-aliasing code in between. Note that our AA is only defined for pointers that are accessed. This is exactly why we can't use AA to reason about pointer comparisons. In fact, it may be the case that our AA queries are only well defined in general if some semantics-preserving transformation could put the two accesses next to each other. I've often thought this to be true, and perhaps this is why. |
The compiler does not create out of bound access.
I think we should not suppress optimizations, but instead fix the real underlying problems. |
agreed.
|
Okay, so, i admit to feeling confused why we've defined inttoptr and ptrtoint this way (IE to quantumly entangle other pointers, etc). I feel (probably wrongly), like the pointer aliasing definition should be something like "inttoptr only aliases the memory that is at the address of value passed to it". IE inttoptr(ptrtoint(x)) is a no-op from aliasing perspective. Whatever i can prove about the aliasing or non-alias of those objects is up to me. To prove anything about the aliasing result, i would need to take the pointer x, try to perform math on it, and see what the result is. Note also the result gcc 7 gives for the original example: bam. This has been like this as far back as i was lazy enough to try ;) ICC 2018 gives a correct result of a=100 x=0 |
It's probably not a good idea but I wanted to float the possibility of having the compiler look for cases where a one-past-the-end pointer might cause trouble and either add a bit of padding or else lay out the stack slots in a different order. |
The LLVM reference (except getelementptr inbounds) seem to miss concepts of valid pointer representation, pointer provenance or partial ordering of pointers. The ability of BasicAA to deduce pointers as non-aliasing based on provenance seem to be inferred from C/C++ semantics. From the semantics of icmp:
Sanjoy Das seem to have worked on non-integral pointer types which cannot be created "from thin air" by disallowing integer conversion instructions. |
The alias() API is defined for any pair of pointers which are both defined at some position in the program. The question is essentially, given two pointer values, for any location in the program where both pointer values are defined, is it possible that a store to one pointer would conflict with a load or store to the other pointer? The possible answers are essentially yes, no, or don't know. If it in't legal to dereference a pointer, it trivially doesn't alias anything: there are no memory accesses through the pointer, so there isn't any possible conflict. Similarly, if a pointer points to a constant, it doesn't alias anything: there are no stores through it.
icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef. If the text of LangRef describing icmp isn't sufficiently clear, suggestions welcome. If some transform isn't consistent with this, please file a bug. For provenance, the rules we have are described at http://llvm.org/docs/LangRef.html#pointeraliasing . The problem described here is that the rules don't actually work. See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf . |
Doesn't LLVM also assume that a fresh pointer returned by malloc does not equal any older pointer, and optimize icmp accordingly? They could physically be equal if addresses are reused (and that easily happens e.g. with jemalloc). C has a clause that pointers become undef when you deallocate the block they point to, but I have not seen such a clause in the LVLM semantics (but I might have just missed it). |
There's a class of optimizations where LLVM implicitly replaces malloc/new with an alloca, and we can optimize comparisons against that. In cases where that optimization isn't legal, LLVM doesn't make any assumptions, as far as I know. |
llvm/llvm-bugzilla-archive#46380 |
For those who have not seen this. WG14 now has a draft TS which addresses this topic: |
mentioned in issue #34577 |
mentioned in issue llvm/llvm-bugzilla-archive#37469 |
mentioned in issue #39193 |
mentioned in issue llvm/llvm-bugzilla-archive#46380 |
mentioned in issue llvm/llvm-bugzilla-archive#52570 |
FWIW: |
Would it make sense to have a |
What is the semantics of that operation? Why would a frontend not just emit |
Depending on your provenance model, |
You mean, depending on which choice LLVM makes for its provenance mode? LLVM IR is a language with its own semantics, decisions like the provenance model have to be made by the LLVM project (as they require adjusting the optimizations to be conformant). Frontends don't get to choose the LLVM provenance model. Frontend languages will have their own provenance model and they have to ensure that it can be mapped into LLVM's (similar to how e.g. frontends have to map their notion of 'uninitialized memory' to what LLVM does).
That sounds like |
Extended Description
Example of an end-to-end miscompilation by clang of the following code involving ptrtoint:
$ cat c.c
#include <stdio.h>
void f(int*, int*);
int main()
{
int a=0, y[1], x = 0;
uintptr_t pi = (uintptr_t) &x;
uintptr_t yi = (uintptr_t) (y+1);
uintptr_t n = pi != yi;
if (n) {
a = 100;
pi = yi;
}
if (n) {
a = 100;
pi = (uintptr_t) y;
}
*(int *)pi = 15;
printf("a=%d x=%d\n", a, x);
f(&x,y);
return 0;
}
$ cat b.c
void f(intx, inty) {}
$ clang -O2 c.c b.c -o foo
$ ./foo
a=0 x=0
This result is wrong. The two possible outcomes are: a=0 x=15, and a=100 x=0.
The bug is in Instcombine that treats inttoptr(ptrtoint(x)) == x, which is incorrect. This transformation can only be done if x is dereferenceable for the accesses through inttoptr.
Compare the following:
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa -instcombine
Integer compares are replaces with pointer compares (wrong) and load/stores are changed from inttoptr to pointers directly (also wrong).
Test case by Gil Hur.
The text was updated successfully, but these errors were encountered: