New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InstCombine cannot blindly assume that inttoptr(ptrtoint x) -> x #33896
Comments
Personally I get "a=100 x=0" for this test case. Does the soundness issue go away if we stop optimizing 'n' to 1? That seems like a much better fix, IMO. Integer/pointer casts are far more important to optimize through than |
Reid, try this change: int a = 0, y[1], x = 0; |
Doesn't seem to repro here as well (x86-64_linux) [same happens with John's suggested change]. [davide@cupiditate bin]$ ./clang c.c b.c -O1 -o foo && ./foo |
Also [davide@cupiditate bin]$ ./clang c.c -O0 -S -emit-llvm -o - | ./opt -S -sroa > pre.ll Presumably this might have been fixed/hidden? Which revision are you at? Side note, I had to add <stdint.h> to the code to get it compile. |
Weird, I see "a=0 x=0" at -O2 using both 4.0 and 5.0 on OS X and Ubuntu 16.04 on x86-64. |
Can you verify this still happen on ToT ? In the meanwhile, I'm going to try with 5.0 on my machine (see if that applies, and bisect in case). |
I could not reproduce the problem either (same IR) |
Johns-MacBook-Pro-2:code regehr$ clang -v Johns-MacBook-Pro-2:code regehr$ clang -O2 alias4.c alias4-b.c Johns-MacBook-Pro-2:code regehr$ cat alias4.c void f(int *, int *); int main() { if (n) { if (n) { *(int *)pi = 15; printf("a=%d x=%d\n", a, x); f(&x, y); return 0; Johns-MacBook-Pro-2:code regehr$ cat alias4-b.c |
I don't think we do that? At least, I can't reproduce that with trunk clang... and it's pretty clearly not a legal transform. See https://reviews.llvm.org/rL249490 . |
Ok, so I guess some of you cannot reproduce this because your clang has a default option to enable some sanitization. $ ./bin/clang -cc1 -triple x86_64-pc-linux-gnu -emit-obj -O2 -o c.o c.c -internal-externc-isystem /usr/include/ -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-isystem /home/nuno/llvm-build/lib/clang/6.0.0/include $ ./bin/clang -o foo b.c c.o $ ./foo |
I was able to repro with John's reordering of the variables, but that doesn't seem that important. Why can't we assume inttoptr(ptrtoint(x)) is a no-op? What's wrong with that in principle? "a=0 x=0" seems like a real bug, but I'm not convinced this is the cause. |
Can someone attach a ll file and the opt commandline to reproduce the problem consistently? |
The problem really stems the following rule from http://llvm.org/docs/LangRef.html#pointer-aliasing-rules:
I think Nuno wrote up a good description somewhere, but essentially the problem is that we end up erasing dependencies, so alias analysis returns the wrong result. |
Is this the problem: inttoptr(ptrtoint(x)) + int_value where the int_value is actually a pointer but x is not. If the llvm folds the first, this becomes a getelementptr with x as the base, then the result value won't aliased with what int_value points to anymore? |
I think that this is essentially correct. The problem here is that inttoptr(ptrtoint(x)) only appears be based on x. However, in this case, the inttoptr is also control dependent on a comparison between some other pointer values. The resulting pointer, therefore, is based on all of these pointers. AA won't see this, just based on inttoptr(ptrtoint(x)) == x, because it will only see x. |
I can now reproduce the problem. An IR reproducible: llc good.ll > good.s ./a.out opt -early-cse-memssa -S < good.ll | llc > bad.s ./a.out opt -early-cse-memssa -S < good.ll > bad.ll The key diff between good.ll and bad.ll are < %add.ptr.x = select i1 %cmp, i32* %add.ptr, i32* %x
|
I suspect the answer is still yes because an answer of no is worse. So %add.ptr.x is defined as: %add.ptr = getelementptr inbounds [1 x i32], [1 x i32]* %y, i64 0, i64 1 And CSE is just applying the identity: q := (a != b ? b : a) => q := b and that certainly seems reasonable to me. FWIW, as currently defined, the icmp is currently well defined even for operands that point to different objects (we currently define the result in terms of the result of ptrtoint on each operand). |
That's precisely the problem though -- given what you said, the expression is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be safely transformed to "q := b". |
I don't see the connection with ptrtoint/intptr. Whether the conditional assignment is optimized into q := b seems irrelevant. Suppose we have a local pointer analysis that can see through this and will know pi's value is y+1 (after the first select). I think the real problem is that the address y+1 is out of bounds of y, though there is no out of bounds access, the compiler can not assume that this address does not alias with other object, in this case 'x' (the second issue in CSE mentioned in comment #16) |
Indeed. Good point. We could say that this transformation is unsafe unless we know that both a and b have the same underlying object. |
Yes, we could define things this way. It would make our AA much less powerful (because we'd need to assume that any pointer we could not prove in bounds could potentially alias with all other pointers). We should find a different solution. |
But if ptrtoint(a) == ptrtoint(b) does not imply that a == b, that also seems unfortunate. I think, however, that I'd still prefer to restrict the inttoptr(ptrtoint(x)) folding to this. |
Re #21. Suppressing the optimization for this cases looks like paper overing the real problem. Regarding AA, I think in practice, the common case is that off-by-one address is used as sentinel value, so we only need to teach AA to handle this explicit out of bound case instead of making AA generally less conservative. Besides, we can teach AA to look at real references. If there are, we can assume OOB does not exist -- otherwise the behavior is UB. |
less conservative --> more conservative.
|
That's pretty fundamental, unfortunately. If we want to do high level AA inferences (like pointers derived off of two distinct malloc()'s do not alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).
I'm not sure if you meant "restricting inttoptr(ptrtoint(x))==x is more preferable" or the converse (the "however" threw me off :) ), but in any case, I think we have to restrict both. We can't assume ptrtoint(a) == ptrtoint(b) implies a == b for the reason stated above. And if we allow inttoptr(ptrtoint(x)) == x (but not "ptrtoint(a) == ptrtoint(b) implies a == b") then we have: i64 A = ... and we replace C with B (since we're dealing with integers here so our restriction does not apply), and the whole program was actually: // X and Y are 1 byte allocas where the address X+1 is the same as Y then the inttoptr(ptrtoint(x)) == x rule gives us: // X and Y are 1 byte allocas where the address X+1 is the same as Y which is an out of bounds access. |
I'm still not seeing how this causes a problem. If I have two pointers, a and b, and they don't alias, but ptrtoint(a) == ptrtoint(b), then they must have non-overlapping lifetimes. To have non-overlapping lifetimes, there must be some mutually-aliasing code in between. Note that our AA is only defined for pointers that are accessed. This is exactly why we can't use AA to reason about pointer comparisons. In fact, it may be the case that our AA queries are only well defined in general if some semantics-preserving transformation could put the two accesses next to each other. I've often thought this to be true, and perhaps this is why. |
The compiler does not create out of bound access.
I think we should not suppress optimizations, but instead fix the real underlying problems. |
agreed.
|
Okay, so, i admit to feeling confused why we've defined inttoptr and ptrtoint this way (IE to quantumly entangle other pointers, etc). I feel (probably wrongly), like the pointer aliasing definition should be something like "inttoptr only aliases the memory that is at the address of value passed to it". IE inttoptr(ptrtoint(x)) is a no-op from aliasing perspective. Whatever i can prove about the aliasing or non-alias of those objects is up to me. To prove anything about the aliasing result, i would need to take the pointer x, try to perform math on it, and see what the result is. Note also the result gcc 7 gives for the original example: bam. This has been like this as far back as i was lazy enough to try ;) ICC 2018 gives a correct result of a=100 x=0 |
It's probably not a good idea but I wanted to float the possibility of having the compiler look for cases where a one-past-the-end pointer might cause trouble and either add a bit of padding or else lay out the stack slots in a different order. |
(I feel like I may be rehashing things that have already been settled, please let me know if that's the case.) That's what I was asking though -- are there cases where folding "p == q" to "false" (assuming we can prove based_on(p) != based_on(q)) is important? Why can't we say that since we can't do provenance based comparisons at runtime, pointer comparisons are equivalent to comparing their bitwise representations at runtime? This prevents propagateEquality on pointer types though, but it looks like we're fine with that? |
I tried to catch LLVM in the act of making pointer comparison non-deterministic, but actually it seems to try hard to not do so. In particular, while it will fold "p+n == q" to false if that's the only use of p and qo, it no longer does so if p or q are escaped: https://godbolt.org/g/RD91og. So, is this "p+n == q --> false" actually handled similar to foldAllocaCmp? (I've always been wondering why that one should only handle alloca, anyway.) |
Are you assuming you can never determine q is a constant? Or are you also considering the case where q is constant? Because the obvious answer when q is constant nullptr is "null pointer checks". Past that, this is generally how value profiling would work to remove aliasing, etc, so we should be careful there is some way to make that work (IE comparison of p or q with constant pointers or ranges, usually) Past all that, when p and q is "guaranteed" to be non-constant, yeah, don't care in practice. |
For reference, these slides give an illustrated explanation on this bug: http://llvm.org/devmtg/2018-04/slides/Lopes-Sbirlea-Pointers,%20Alias%20and%20ModRef%20Analyses.pdf |
There is now a bunch of examples in bug 44313 and bug 44374. Mostly they provide full simple examples for aspects already touched in this bug in some way or another. But the example with
It did in the past but that was fixed in 2014 -- see bug 13507 and bug 21327.
It seems there was an effort[1][2] to fix the problem as it applies to devirtualization independently from the general case, and the particular testcase was fixed between clang 6 and 7. I've just filed bug 44472 with new testcases. [1] https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo |
There was also another bug that involves restrict and int-ptr casting that were reported by us: #39193 define i8* @f(i8* nocapture %p, i8* %q) { (alive2 is still in progress; it does not support inttoptr and noalias yet, so wouldn't check your bug and this inttoptr bug, but I'll be happy if this tool can help determine when it's safe to replace pointer in the future).
This is very interesting. So LLVM (at least in 2014) regards such behavior as buggy. |
There are many thoughtful and thought-provoking comments here, thanks for this! But one comment has far-reaching consequences and especially resonated with my older thoughts:
This is a quite powerful principle.
|
I came here from https://blog.regehr.org/archives/1737 and can maybe add another perspective. It seems that the bug is the result of mixing two concepts of a pointer. First, as a symbolic reference to another value, second as the bit-pattern of a memory address. The former could also be represented as a tuple (baseptr,offset), the baseptr being the pointer's provenance. Pointers with different provenance cannot alias. The problem being that there exist identical address bit-patterns that represent different logical pointers, namely the address of the pointer one past the baseptr's memory allocation can be another allocation.[*] In the model of pointer as a logical reference, ptr2int is an implementation-defined non-injective (non-surjective) functions. That is, p!=q does not imply ptr2int(p)!=ptr2int(q) because of the validity of one-past-the-end pointers. Therefore, int2ptr is ambiguous: we cannot infer an unique (baseptr,offset) from the integer value. The conversion int2ptr(ptr2int(p)) therefore is 'lossy'. AliasAnalysis is linked to pointer comparison in that if we can deduce I hope this formalizes a bit what is also called 'magic', 'opaque' and 'non-deterministic' in this thread. Probably it's not just me who got confused. Clearly, we want the transformation I can think of two solutions:
In the second phase, pointers are assumed to be address bit-patterns, i.e. ptr2int is assumed on any pointers. This is a kind of lowering. AliasAnalysis may not assume provenance anymore.
Since provenance is language concept, it makes sense to not just assume the same thing in the IR, especially since there seem to be differences in C++ and C (comment #77). If my interpretation is correct, the icmp above is required to be an integer comparison in C11, but can be assumed to be false in C++. In any case, the provenance of %add.ptr.x as to be cleared ("assumed unknown") since it merges pointers of different provenance to a single pointer. Alternatively, provenance could be a set. Note that there is the special case of std::less: To make std::set/std::map work, it must be defined on pointers of different provenance. https://en.cppreference.com/w/cpp/utility/functional/less says:
In libc++, I don't see any special mechanism to make on all pointers (https://github.com/llvm/llvm-project/blob/master/libcxx/include/__functional_base#L54). This case might be interesting:
[*] We ignore that within a segmented memory model, or virtual page tables, multiple virtual memory addresses can be mapped to the same physical memory. |
Note that to just allocate another byte past the last element of arrays (comment #30) for stack variables and globals (and malloc implementations, in particular BumpPtrAllocator) would probably be easiest: It makes the int2ptr function injective. Similar the reason why the size of an empty struct is not zero and malloc(0) returns an unique allocation. |
Michael, my guess is that adding a byte (which of course becomes more than a byte when there are alignment requirements) is a showstopper for memory-constrained embedded use cases. Also note that (I think!) this would have to be done even for stack variables and globals that are not arrays. |
Just mentioning it would be easiest and there is precedence. For stack variables, I initially expected it to only necessary whenever there is pointer arithmetic involving non-constants which is only common for e.g. buffers. However, you're right it might be necessary for any captured variable; especially in C where there a pointer of a scalar variable is an array of size one and pass-by-reference is done by passing the pointer. There is an aspect of AliasAnalysis that I did not consider: it might only care about dereferenceable pointers. Since the one-past-the-last pointer is not dereferenceable, int2ptr is injective over the set of derefenceable pointers. Not sure whether LLVM's AliasAnalysis interface is only defined for valid pointer representations (ie including one-past-the-last) of just dereferenceable pointers. The latter would be hard for analysis since one had to proof whether a pointer is dereferenceable before an AA query. |
Also see https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf for a detailed discussion of LLVM pointer semantics. |
The LLVM reference (except getelementptr inbounds) seem to miss concepts of valid pointer representation, pointer provenance or partial ordering of pointers. The ability of BasicAA to deduce pointers as non-aliasing based on provenance seem to be inferred from C/C++ semantics. From the semantics of icmp:
Sanjoy Das seem to have worked on non-integral pointer types which cannot be created "from thin air" by disallowing integer conversion instructions. |
The alias() API is defined for any pair of pointers which are both defined at some position in the program. The question is essentially, given two pointer values, for any location in the program where both pointer values are defined, is it possible that a store to one pointer would conflict with a load or store to the other pointer? The possible answers are essentially yes, no, or don't know. If it in't legal to dereference a pointer, it trivially doesn't alias anything: there are no memory accesses through the pointer, so there isn't any possible conflict. Similarly, if a pointer points to a constant, it doesn't alias anything: there are no stores through it.
icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef. If the text of LangRef describing icmp isn't sufficiently clear, suggestions welcome. If some transform isn't consistent with this, please file a bug. For provenance, the rules we have are described at http://llvm.org/docs/LangRef.html#pointeraliasing . The problem described here is that the rules don't actually work. See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf . |
Doesn't LLVM also assume that a fresh pointer returned by malloc does not equal any older pointer, and optimize icmp accordingly? They could physically be equal if addresses are reused (and that easily happens e.g. with jemalloc). C has a clause that pointers become undef when you deallocate the block they point to, but I have not seen such a clause in the LVLM semantics (but I might have just missed it). |
There's a class of optimizations where LLVM implicitly replaces malloc/new with an alloca, and we can optimize comparisons against that. In cases where that optimization isn't legal, LLVM doesn't make any assumptions, as far as I know. |
llvm/llvm-bugzilla-archive#46380 |
For those who have not seen this. WG14 now has a draft TS which addresses this topic: |
mentioned in issue #34577 |
mentioned in issue llvm/llvm-bugzilla-archive#37469 |
mentioned in issue #39193 |
mentioned in issue llvm/llvm-bugzilla-archive#46380 |
mentioned in issue llvm/llvm-bugzilla-archive#52570 |
FWIW: |
Extended Description
Example of an end-to-end miscompilation by clang of the following code involving ptrtoint:
$ cat c.c
#include <stdio.h>
void f(int*, int*);
int main()
{
int a=0, y[1], x = 0;
uintptr_t pi = (uintptr_t) &x;
uintptr_t yi = (uintptr_t) (y+1);
uintptr_t n = pi != yi;
if (n) {
a = 100;
pi = yi;
}
if (n) {
a = 100;
pi = (uintptr_t) y;
}
*(int *)pi = 15;
printf("a=%d x=%d\n", a, x);
f(&x,y);
return 0;
}
$ cat b.c
void f(intx, inty) {}
$ clang -O2 c.c b.c -o foo
$ ./foo
a=0 x=0
This result is wrong. The two possible outcomes are: a=0 x=15, and a=100 x=0.
The bug is in Instcombine that treats inttoptr(ptrtoint(x)) == x, which is incorrect. This transformation can only be done if x is dereferenceable for the accesses through inttoptr.
Compare the following:
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa -instcombine
Integer compares are replaces with pointer compares (wrong) and load/stores are changed from inttoptr to pointers directly (also wrong).
Test case by Gil Hur.
The text was updated successfully, but these errors were encountered: