Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InstCombine cannot blindly assume that inttoptr(ptrtoint x) -> x #33896

Open
nunoplopes opened this issue Sep 11, 2017 · 113 comments
Open

InstCombine cannot blindly assume that inttoptr(ptrtoint x) -> x #33896

nunoplopes opened this issue Sep 11, 2017 · 113 comments
Labels

Comments

@nunoplopes
Copy link
Member

Bugzilla Link 34548
Version trunk
OS All
Blocks #34577 #39193
CC @comex,@majnemer,@vns-mn,@dwblaikie,@efriedma-quic,@fhahn,@hfinkel,@jensmaurer,@dobbelaj-snps,@aqjune,@RKSimon,@Meinersbur,@sunfishcode,@MattPD,@uecker,@RalfJung,@regehr,@rnk,@sanjoy,@rotateright,@yuanfang-chen

Extended Description

Example of an end-to-end miscompilation by clang of the following code involving ptrtoint:

$ cat c.c
#include <stdio.h>

void f(int*, int*);

int main()
{
int a=0, y[1], x = 0;
uintptr_t pi = (uintptr_t) &x;
uintptr_t yi = (uintptr_t) (y+1);
uintptr_t n = pi != yi;

if (n) {
a = 100;
pi = yi;
}

if (n) {
a = 100;
pi = (uintptr_t) y;
}

*(int *)pi = 15;

printf("a=%d x=%d\n", a, x);

f(&x,y);

return 0;
}

$ cat b.c
void f(intx, inty) {}

$ clang -O2 c.c b.c -o foo

$ ./foo
a=0 x=0

This result is wrong. The two possible outcomes are: a=0 x=15, and a=100 x=0.

The bug is in Instcombine that treats inttoptr(ptrtoint(x)) == x, which is incorrect. This transformation can only be done if x is dereferenceable for the accesses through inttoptr.
Compare the following:
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa
clang -O0 -S -emit-llvm -o - c.c | opt -S -sroa -instcombine

Integer compares are replaces with pointer compares (wrong) and load/stores are changed from inttoptr to pointers directly (also wrong).

Test case by Gil Hur.

@rnk
Copy link
Collaborator

rnk commented Sep 11, 2017

Personally I get "a=100 x=0" for this test case.

Does the soundness issue go away if we stop optimizing 'n' to 1? That seems like a much better fix, IMO. Integer/pointer casts are far more important to optimize through than

@regehr
Copy link
Contributor

regehr commented Sep 11, 2017

Reid, try this change:

int a = 0, y[1], x = 0;
//int a=0, x = 0, y[1];

@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2017

Reid, try this change:

int a = 0, y[1], x = 0;
//int a=0, x = 0, y[1];

Doesn't seem to repro here as well (x86-64_linux) [same happens with John's suggested change].

[davide@cupiditate bin]$ ./clang c.c b.c -O1 -o foo && ./foo
a=100 x=0
[davide@cupiditate bin]$ ./clang c.c b.c -O2 -o foo && ./foo
a=100 x=0
[davide@cupiditate bin]$ ./clang c.c b.c -O3 -o foo && ./foo
a=100 x=0

@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2017

Also

[davide@cupiditate bin]$ ./clang c.c -O0 -S -emit-llvm -o - | ./opt -S -sroa > pre.ll
[davide@cupiditate bin]$ ./clang c.c -O0 -S -emit-llvm -o - | ./opt -S -sroa -instcombine > post.ll
[davide@cupiditate bin]$ diff -u pre.ll post.ll
[davide@cupiditate bin]$

Presumably this might have been fixed/hidden? Which revision are you at?
I'm at clang version 6.0.0 (trunk 312918) (llvm/trunk 312925)

Side note, I had to add <stdint.h> to the code to get it compile.

@regehr
Copy link
Contributor

regehr commented Sep 11, 2017

Weird, I see "a=0 x=0" at -O2 using both 4.0 and 5.0 on OS X and Ubuntu 16.04 on x86-64.

@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2017

Weird, I see "a=0 x=0" at -O2 using both 4.0 and 5.0 on OS X and Ubuntu
16.04 on x86-64.

Can you verify this still happen on ToT ? In the meanwhile, I'm going to try with 5.0 on my machine (see if that applies, and bisect in case).

@david-xl
Copy link
Contributor

I could not reproduce the problem either (same IR)

@regehr
Copy link
Contributor

regehr commented Sep 11, 2017

Johns-MacBook-Pro-2:code regehr$ clang -v
clang version 6.0.0 (trunk 312932)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Users/regehr/llvm-install/bin

Johns-MacBook-Pro-2:code regehr$ clang -O2 alias4.c alias4-b.c
Johns-MacBook-Pro-2:code regehr$ ./a.out
a=0 x=0

Johns-MacBook-Pro-2:code regehr$ cat alias4.c
#include <stdint.h>
#include <stdio.h>

void f(int *, int *);

int main() {
int a = 0, y[1], x = 0;
//int a=0, x = 0, y[1];
uintptr_t pi = (uintptr_t)&x;
uintptr_t yi = (uintptr_t)(y + 1);
uintptr_t n = pi != yi;

if (n) {
a = 100;
pi = yi;
}

if (n) {
a = 100;
pi = (uintptr_t)y;
}

*(int *)pi = 15;

printf("a=%d x=%d\n", a, x);

f(&x, y);

return 0;
}

Johns-MacBook-Pro-2:code regehr$ cat alias4-b.c
void f(int *x, int *y) {}

@efriedma-quic
Copy link
Collaborator

Does the soundness issue go away if we stop optimizing 'n' to 1?

I don't think we do that? At least, I can't reproduce that with trunk clang... and it's pretty clearly not a legal transform. See https://reviews.llvm.org/rL249490 .

@nunoplopes
Copy link
Member Author

Ok, so I guess some of you cannot reproduce this because your clang has a default option to enable some sanitization.
Try with clang -cc1 (adjust paths as needed):

$ ./bin/clang -cc1 -triple x86_64-pc-linux-gnu -emit-obj -O2 -o c.o c.c -internal-externc-isystem /usr/include/ -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-isystem /home/nuno/llvm-build/lib/clang/6.0.0/include

$ ./bin/clang -o foo b.c c.o

$ ./foo

@rnk
Copy link
Collaborator

rnk commented Sep 15, 2017

I was able to repro with John's reordering of the variables, but that doesn't seem that important.

Why can't we assume inttoptr(ptrtoint(x)) is a no-op? What's wrong with that in principle? "a=0 x=0" seems like a real bug, but I'm not convinced this is the cause.

@llvmbot
Copy link
Member

llvmbot commented Sep 15, 2017

Can someone attach a ll file and the opt commandline to reproduce the problem consistently?

@efriedma-quic
Copy link
Collaborator

The problem really stems the following rule from http://llvm.org/docs/LangRef.html#pointer-aliasing-rules:

  • A pointer value formed by an inttoptr is based on all pointer values that contribute (directly or indirectly) to the computation of the pointer’s value.

I think Nuno wrote up a good description somewhere, but essentially the problem is that we end up erasing dependencies, so alias analysis returns the wrong result.

@llvmbot
Copy link
Member

llvmbot commented Sep 15, 2017

The problem really stems the following rule from
http://llvm.org/docs/LangRef.html#pointer-aliasing-rules:

  • A pointer value formed by an inttoptr is based on all pointer values that
    contribute (directly or indirectly) to the computation of the pointer’s
    value.

I think Nuno wrote up a good description somewhere, but essentially the
problem is that we end up erasing dependencies, so alias analysis returns
the wrong result.

Is this the problem:

inttoptr(ptrtoint(x)) + int_value

where the int_value is actually a pointer but x is not.

If the llvm folds the first, this becomes a getelementptr with x as the base, then the result value won't aliased with what int_value points to anymore?

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 15, 2017

The problem really stems the following rule from
http://llvm.org/docs/LangRef.html#pointer-aliasing-rules:

  • A pointer value formed by an inttoptr is based on all pointer values that
    contribute (directly or indirectly) to the computation of the pointer’s
    value.

I think Nuno wrote up a good description somewhere, but essentially the
problem is that we end up erasing dependencies, so alias analysis returns
the wrong result.

I think that this is essentially correct. The problem here is that inttoptr(ptrtoint(x)) only appears be based on x. However, in this case, the inttoptr is also control dependent on a comparison between some other pointer values. The resulting pointer, therefore, is based on all of these pointers. AA won't see this, just based on inttoptr(ptrtoint(x)) == x, because it will only see x.

@llvmbot
Copy link
Member

llvmbot commented Sep 15, 2017

I can now reproduce the problem. An IR reproducible:

llc good.ll > good.s
clang good.s f.o

./a.out
100, 0

opt -early-cse-memssa -S < good.ll | llc > bad.s
clang bad.s f.o

./a.out
0,0

opt -early-cse-memssa -S < good.ll > bad.ll

The key diff between good.ll and bad.ll are

< %add.ptr.x = select i1 %cmp, i32* %add.ptr, i32* %x
17c20
< %pi.1.in = select i1 %cmp, i32* %arraydecay, i32* %add.ptr.x

%pi.1.in = select i1 %cmp, i32* %arraydecay, i32* %add.ptr

And:

< %2 = load i32, i32* %x, align 4, !tbaa !​2
< %call = call i32 (i8*, ...) @​printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i64 0, i64 0), i32 %a.1, i32 %2)

%call = call i32 (i8*, ...) @​printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i64 0, i64 0), i32 %a.1, i32 0)

Looks like Early CSE wrongly

  1. commonizes the address of x and y+1 and makes %add.ptr.x to be y + 1
  2. With 1), pi does not look like alias with x anymore, so initial value of x of 0 gets propagated to printf.

Does it have anything to do with intptr/ptrint?

// good.ll:

@.str = private unnamed_addr constant [11 x i8] c"a=%d x=%d\0A\00", align 1

define i32 @​main() local_unnamed_addr #​0 {
entry:
%x = alloca i32, align 4
%y = alloca [1 x i32], align 4
%0 = bitcast i32* %x to i8*
call void @​llvm.lifetime.start.p0i8(i64 4, i8* %0) #​4
store i32 0, i32* %x, align 4, !tbaa !​2
%1 = bitcast [1 x i32]* %y to i8*
call void @​llvm.lifetime.start.p0i8(i64 4, i8* %1) #​4
%arraydecay = getelementptr inbounds [1 x i32], [1 x i32]* %y, i64 0, i64 0
%add.ptr = getelementptr inbounds [1 x i32], [1 x i32]* %y, i64 0, i64 1
%cmp = icmp ne i32* %x, %add.ptr
%add.ptr.x = select i1 %cmp, i32* %add.ptr, i32* %x
%. = select i1 %cmp, i32 100, i32 0
%pi.1.in = select i1 %cmp, i32* %arraydecay, i32* %add.ptr.x
%a.1 = select i1 %cmp, i32 100, i32 %.
store i32 15, i32* %pi.1.in, align 4, !tbaa !​2
%2 = load i32, i32* %x, align 4, !tbaa !​2
%call = call i32 (i8*, ...) @​printf(i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str, i64 0, i64 0), i32 %a.1, i32 %2)
call void @​Z1fPiS(i32* nonnull %x, i32* %arraydecay)
call void @​llvm.lifetime.end.p0i8(i64 4, i8* %1) #​4
call void @​llvm.lifetime.end.p0i8(i64 4, i8* %0) #​4
ret i32 0
}

; Function Attrs: argmemonly nounwind
declare void @​llvm.lifetime.start.p0i8(i64, i8* nocapture) #​1

; Function Attrs: nounwind
declare i32 @​printf(i8* nocapture readonly, ...) local_unnamed_addr #​2

declare void @​Z1fPiS(i32*, i32*) local_unnamed_addr #​3

; Function Attrs: argmemonly nounwind
declare void @​llvm.lifetime.end.p0i8(i64, i8* nocapture) #​1

attributes #​0 = { norecurse uwtable }
attributes #​1 = { argmemonly nounwind }
attributes #​2 = { nounwind }
attributes #​3 = { nounwind }
attributes #​4 = { nounwind }

!llvm.module.flags = !{#0}
!llvm.ident = !{#1}

!​0 = !{i32 1, !"wchar_size", i32 4}
!​1 = !{!"clang version 6.0.0 (trunk 312663)"}
!​2 = !{#3, !​3, i64 0}
!​3 = !{!"int", !​4, i64 0}
!​4 = !{!"omnipotent char", !​5, i64 0}
!​5 = !{!"Simple C++ TBAA"}

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 16, 2017

I can now reproduce the problem. An IR reproducible:
...

Looks like Early CSE wrongly

  1. commonizes the address of x and y+1 and makes %add.ptr.x to be y + 1
  2. With 1), pi does not look like alias with x anymore, so initial value of
    x of 0 gets propagated to printf.

Does it have anything to do with intptr/ptrint?

I suspect the answer is still yes because an answer of no is worse. So %add.ptr.x is defined as:

%add.ptr = getelementptr inbounds [1 x i32], [1 x i32]* %y, i64 0, i64 1
%cmp = icmp ne i32* %x, %add.ptr
%add.ptr.x = select i1 %cmp, i32* %add.ptr, i32* %x

And CSE is just applying the identity:

q := (a != b ? b : a) => q := b

and that certainly seems reasonable to me. FWIW, as currently defined, the icmp is currently well defined even for operands that point to different objects (we currently define the result in terms of the result of ptrtoint on each operand).

@sanjoy
Copy link
Contributor

sanjoy commented Sep 16, 2017

(we currently define the result in terms of the result of ptrtoint
on each operand).

That's precisely the problem though -- given what you said, the expression is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be safely transformed to "q := b".

@david-xl
Copy link
Contributor

I don't see the connection with ptrtoint/intptr.

Whether the conditional assignment is optimized into q := b seems irrelevant. Suppose we have a local pointer analysis that can see through this and will know pi's value is y+1 (after the first select).

I think the real problem is that the address y+1 is out of bounds of y, though there is no out of bounds access, the compiler can not assume that this address does not alias with other object, in this case 'x' (the second issue in CSE mentioned in comment #​16)

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 16, 2017

(we currently define the result in terms of the result of ptrtoint
on each operand).

That's precisely the problem though -- given what you said, the expression
is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be
safely transformed to "q := b".

Indeed. Good point. We could say that this transformation is unsafe unless we know that both a and b have the same underlying object.

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 16, 2017

I don't see the connection with ptrtoint/intptr.

Whether the conditional assignment is optimized into q := b seems
irrelevant. Suppose we have a local pointer analysis that can see through
this and will know pi's value is y+1 (after the first select).

I think the real problem is that the address y+1 is out of bounds of y,
though there is no out of bounds access, the compiler can not assume that
this address does not alias with other object, in this case 'x' (the second
issue in CSE mentioned in comment #​16)

Yes, we could define things this way. It would make our AA much less powerful (because we'd need to assume that any pointer we could not prove in bounds could potentially alias with all other pointers). We should find a different solution.

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 16, 2017

(we currently define the result in terms of the result of ptrtoint
on each operand).

That's precisely the problem though -- given what you said, the expression
is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be
safely transformed to "q := b".

Indeed. Good point. We could say that this transformation is unsafe unless
we know that both a and b have the same underlying object.

But if ptrtoint(a) == ptrtoint(b) does not imply that a == b, that also seems unfortunate.

I think, however, that I'd still prefer to restrict the inttoptr(ptrtoint(x)) folding to this.

@david-xl
Copy link
Contributor

Re #​21. Suppressing the optimization for this cases looks like paper overing the real problem.

Regarding AA, I think in practice, the common case is that off-by-one address is used as sentinel value, so we only need to teach AA to handle this explicit out of bound case instead of making AA generally less conservative. Besides, we can teach AA to look at real references. If there are, we can assume OOB does not exist -- otherwise the behavior is UB.

@david-xl
Copy link
Contributor

Re #​21. Suppressing the optimization for this cases looks like paper overing
the real problem.

Regarding AA, I think in practice, the common case is that off-by-one
address is used as sentinel value, so we only need to teach AA to handle
this explicit out of bound case instead of making AA generally less
conservative.

less conservative --> more conservative.

Besides, we can teach AA to look at real references. If there
are, we can assume OOB does not exist -- otherwise the behavior is UB.

@sanjoy
Copy link
Contributor

sanjoy commented Sep 16, 2017

(we currently define the result in terms of the result of ptrtoint
on each operand).

That's precisely the problem though -- given what you said, the expression
is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be
safely transformed to "q := b".

Indeed. Good point. We could say that this transformation is unsafe unless
we know that both a and b have the same underlying object.

But if ptrtoint(a) == ptrtoint(b) does not imply that a == b, that also
seems unfortunate.

That's pretty fundamental, unfortunately. If we want to do high level AA inferences (like pointers derived off of two distinct malloc()'s do not alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).

I think, however, that I'd still prefer to restrict the
inttoptr(ptrtoint(x)) folding to this.

I'm not sure if you meant "restricting inttoptr(ptrtoint(x))==x is more preferable" or the converse (the "however" threw me off :) ), but in any case, I think we have to restrict both. We can't assume ptrtoint(a) == ptrtoint(b) implies a == b for the reason stated above.

And if we allow inttoptr(ptrtoint(x)) == x (but not "ptrtoint(a) == ptrtoint(b) implies a == b") then we have:

i64 A = ...
i64 B = ...
i64 C = A == B ? A : B;

and we replace C with B (since we're dealing with integers here so our restriction does not apply), and the whole program was actually:

// X and Y are 1 byte allocas where the address X+1 is the same as Y
i64 A = ptrtoint(Y)
i64 B = ptrtoint(X+1)
i64 C = B // after the transformation
i8* C_ptr = inttoptr(C)
*C_ptr = 1

then the inttoptr(ptrtoint(x)) == x rule gives us:

// X and Y are 1 byte allocas where the address X+1 is the same as Y
i64 A = ptrtoint(Y)
i64 B = ptrtoint(X+1)
i64 C = B // after the transformation
i8* C_ptr = inttoptr(C)
*(X+1) = 1

which is an out of bounds access.

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 16, 2017

That's pretty fundamental, unfortunately. If we want to do high level AA inferences (like pointers derived off of two distinct malloc()'s do not alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).

I'm still not seeing how this causes a problem. If I have two pointers, a and b, and they don't alias, but ptrtoint(a) == ptrtoint(b), then they must have non-overlapping lifetimes. To have non-overlapping lifetimes, there must be some mutually-aliasing code in between.

Note that our AA is only defined for pointers that are accessed. This is exactly why we can't use AA to reason about pointer comparisons.

In fact, it may be the case that our AA queries are only well defined in general if some semantics-preserving transformation could put the two accesses next to each other. I've often thought this to be true, and perhaps this is why.

@david-xl
Copy link
Contributor

(we currently define the result in terms of the result of ptrtoint
on each operand).

That's precisely the problem though -- given what you said, the expression
is actually "q := (ptrtoint(a) != ptrtoint(b) ? b : a)", which cannot be
safely transformed to "q := b".

Indeed. Good point. We could say that this transformation is unsafe unless
we know that both a and b have the same underlying object.

But if ptrtoint(a) == ptrtoint(b) does not imply that a == b, that also
seems unfortunate.

That's pretty fundamental, unfortunately. If we want to do high level AA
inferences (like pointers derived off of two distinct malloc()'s do not
alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).

I think, however, that I'd still prefer to restrict the
inttoptr(ptrtoint(x)) folding to this.

I'm not sure if you meant "restricting inttoptr(ptrtoint(x))==x is more
preferable" or the converse (the "however" threw me off :) ), but in any
case, I think we have to restrict both. We can't assume ptrtoint(a) ==
ptrtoint(b) implies a == b for the reason stated above.

And if we allow inttoptr(ptrtoint(x)) == x (but not "ptrtoint(a) ==
ptrtoint(b) implies a == b") then we have:

i64 A = ...
i64 B = ...
i64 C = A == B ? A : B;

and we replace C with B (since we're dealing with integers here so our
restriction does not apply), and the whole program was actually:

// X and Y are 1 byte allocas where the address X+1 is the same as Y
i64 A = ptrtoint(Y)
i64 B = ptrtoint(X+1)
i64 C = B // after the transformation
i8* C_ptr = inttoptr(C)
*C_ptr = 1

then the inttoptr(ptrtoint(x)) == x rule gives us:

// X and Y are 1 byte allocas where the address X+1 is the same as Y
i64 A = ptrtoint(Y)
i64 B = ptrtoint(X+1)
i64 C = B // after the transformation
i8* C_ptr = inttoptr(C)
*(X+1) = 1

which is an out of bounds access.

The compiler does not create out of bound access.

  1. when A != B, the original program has out of bound access
  2. and A == B, the generated code is sementially equivalent to the original program. It is like all locals can be accessed via the same base %rsp.

I think we should not suppress optimizations, but instead fix the real underlying problems.

@david-xl
Copy link
Contributor

That's pretty fundamental, unfortunately. If we want to do high level AA inferences (like pointers derived off of two distinct malloc()'s do not alias) then we have be okay with ptrtoint(a) == ptrtoint(b) != (a == b).

I'm still not seeing how this causes a problem. If I have two pointers, a
and b, and they don't alias, but ptrtoint(a) == ptrtoint(b), then they must
have non-overlapping lifetimes. To have non-overlapping lifetimes, there
must be some mutually-aliasing code in between.

agreed.

Note that our AA is only defined for pointers that are accessed. This is
exactly why we can't use AA to reason about pointer comparisons.

In fact, it may be the case that our AA queries are only well defined in
general if some semantics-preserving transformation could put the two
accesses next to each other. I've often thought this to be true, and perhaps
this is why.

@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2017

Okay, so, i admit to feeling confused why we've defined inttoptr and ptrtoint this way (IE to quantumly entangle other pointers, etc).

I feel (probably wrongly), like the pointer aliasing definition should be something like "inttoptr only aliases the memory that is at the address of value passed to it".

IE inttoptr(ptrtoint(x)) is a no-op from aliasing perspective. Whatever i can prove about the aliasing or non-alias of those objects is up to me.

To prove anything about the aliasing result, i would need to take the pointer x, try to perform math on it, and see what the result is.

Note also the result gcc 7 gives for the original example:
╰─ gcc-7 -O3 foo.c b.c
╰─ ./a.out
a=0 x=0

bam.

This has been like this as far back as i was lazy enough to try ;)

ICC 2018 gives a correct result of a=100 x=0
TL;DR Given both gcc and clang fail at this, and have for years, i'm not sure anyone expects a correct result here :)

@regehr
Copy link
Contributor

regehr commented Sep 16, 2017

It's probably not a good idea but I wanted to float the possibility of having the compiler look for cases where a one-past-the-end pointer might cause trouble and either add a bit of padding or else lay out the stack slots in a different order.

@Meinersbur
Copy link
Member

The LLVM reference (except getelementptr inbounds) seem to miss concepts of valid pointer representation, pointer provenance or partial ordering of pointers. The ability of BasicAA to deduce pointers as non-aliasing based on provenance seem to be inferred from C/C++ semantics.

From the semantics of icmp:

If the operands are pointer typed, the pointer values are compared as if they were integers.

Sanjoy Das seem to have worked on non-integral pointer types which cannot be created "from thin air" by disallowing integer conversion instructions.
http://llvm.org/docs/LangRef.html#nointptrtype

@efriedma-quic
Copy link
Collaborator

Not sure whether LLVM's AliasAnalysis interface is only defined for valid
pointer representations (ie including one-past-the-last) of just
dereferenceable pointers. The latter would be hard for analysis since one
had to proof whether a pointer is dereferenceable before an AA query.

The alias() API is defined for any pair of pointers which are both defined at some position in the program. The question is essentially, given two pointer values, for any location in the program where both pointer values are defined, is it possible that a store to one pointer would conflict with a load or store to the other pointer? The possible answers are essentially yes, no, or don't know.

If it in't legal to dereference a pointer, it trivially doesn't alias anything: there are no memory accesses through the pointer, so there isn't any possible conflict. Similarly, if a pointer points to a constant, it doesn't alias anything: there are no stores through it.

The LLVM reference (except getelementptr inbounds) seem to miss concepts of
valid pointer representation, pointer provenance or partial ordering of
pointers. The ability of BasicAA to deduce pointers as non-aliasing based on
provenance seem to be inferred from C/C++ semantics.

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef. If the text of LangRef describing icmp isn't sufficiently clear, suggestions welcome. If some transform isn't consistent with this, please file a bug.

For provenance, the rules we have are described at http://llvm.org/docs/LangRef.html#pointeraliasing . The problem described here is that the rules don't actually work.

See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2311.pdf .

@RalfJung
Copy link
Contributor

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef.

Doesn't LLVM also assume that a fresh pointer returned by malloc does not equal any older pointer, and optimize icmp accordingly? They could physically be equal if addresses are reused (and that easily happens e.g. with jemalloc). C has a clause that pointers become undef when you deallocate the block they point to, but I have not seen such a clause in the LVLM semantics (but I might have just missed it).

@efriedma-quic
Copy link
Collaborator

icmp is defined to just take the raw pointer bits as an integer. This is described in LangRef.

Doesn't LLVM also assume that a fresh pointer returned by malloc does not
equal any older pointer, and optimize icmp accordingly?

There's a class of optimizations where LLVM implicitly replaces malloc/new with an alloca, and we can optimize comparisons against that. In cases where that optimization isn't legal, LLVM doesn't make any assumptions, as far as I know.

@RalfJung
Copy link
Contributor

If some transform isn't consistent with this, please file a bug.

llvm/llvm-bugzilla-archive#46380

@uecker
Copy link
Mannequin

uecker mannequin commented Apr 3, 2021

For those who have not seen this. WG14 now has a draft TS which addresses this topic:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2676.pdf

@RalfJung
Copy link
Contributor

mentioned in issue #34577

@aqjune
Copy link
Contributor

aqjune commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#37469

@aqjune
Copy link
Contributor

aqjune commented Nov 27, 2021

mentioned in issue #39193

@RalfJung
Copy link
Contributor

mentioned in issue llvm/llvm-bugzilla-archive#46380

@dwblaikie
Copy link
Collaborator

mentioned in issue llvm/llvm-bugzilla-archive#52570

@aqjune
Copy link
Contributor

aqjune commented Apr 20, 2022

FWIW: opt now has a --disable-i2p-p2i-opt flag that disables the roundtrip cast.

@gonzalobg
Copy link
Contributor

Would it make sense to have a ptr2int2ptr(x) instruction that allows InstCombine to merge ptr2int(int2ptr(x)) into a single operation, that can be then lowered by a backend more efficiently, without dropping the ptr2int/int2ptr casts (which I agree is unsound) ?

@RalfJung
Copy link
Contributor

What is the semantics of that operation? Why would a frontend not just emit x instead of ptr2int2ptr(x), if that operation has a semantics that permits such a transformation?

@gonzalobg
Copy link
Contributor

Depending on your provenance model, ptr2int2ptr(x) would have the effect of escaping the pointer (e.g. in a PNVI-ae-udi-like model), while x would not (preventing some compiler reorderings that x would not).

@RalfJung
Copy link
Contributor

Depending on your provenance model,

You mean, depending on which choice LLVM makes for its provenance mode? LLVM IR is a language with its own semantics, decisions like the provenance model have to be made by the LLVM project (as they require adjusting the optimizations to be conformant). Frontends don't get to choose the LLVM provenance model. Frontend languages will have their own provenance model and they have to ensure that it can be mapped into LLVM's (similar to how e.g. frontends have to map their notion of 'uninitialized memory' to what LLVM does).

ptr2int2ptr(x) would have the effect of escaping the pointer (e.g. in a PNVI-ae-udi-like model), while x would not (preventing some compiler reorderings that x would not).

That sounds like y = ptr2int2ptr(x); would be equivalent to ptr2int(x); y = x;. No need for a new operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests