New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WRONG code CFGOpt? InstCombine? #49032
Comments
assigned to @aqjune |
A reduced input:
It's again due to the old transformation in InstCombine 'select i1 %a, i1 true, i1 %b' => 'or i1 %a, %b'. The transformation should be disabled; I'm sending patches to minimize performance impact after disabling it (https://reviews.llvm.org/D93065). |
I think I saw this again: define dso_local signext i32 @f(i32 signext %g, i32 zeroext %h) local_unnamed_addr #0 { => *** IR Dump After InstCombinePass *** This doesn't look right - the check if g < 0 is simply removed for some reason. opt -O2 fun.ll -march=z14 Input: lor.rhs: ; preds = %entry lor.end: ; preds = %lor.rhs, %entry |
|
I'll make a patch that conditionally disables select -> and/or transformation. I think now it's time to start ditching out the optimization. |
Extended Description
This program (wrong0.i):
static int *a;
int b, d;
int *c = &b;
int main() {
int e[3] = {2, 1, -2139220656};
a = &e[0];
d = (e[2] < 0) || (e[2] > (7 >> e[2]));
*c = d;
printf("%d\n", b);
}
I beleive the right-shift with a known negative value is undefined, but since the first expression is always true, the program as a whole is still well-defined, or?
Since e[2] is always less than 0, I would expect the second expression (after the '||') to never be evaluated. However, SimplifyCFGPass does not understand that the first expression is always true and so merges them so that they are always both executed. I wonder why this is allowed...?
%1 = load i32, i32* %arrayidx1, align 4, !tbaa !6
%cmp = icmp slt i32 %1, 0
%shr = ashr i32 7, %1
%cmp4 = icmp sgt i32 %1, %shr
%2 = select i1 %cmp, i1 true, i1 %cmp4
Then InstCombine decides to remove the first check against less than zero:
%arrayidx1 = getelementptr inbounds [3 x i32], [3 x i32]* %e, i64 0, i64 2
%1 = load i32, i32* %arrayidx1, align 4, !tbaa !6
%shr = lshr i32 7, %1
%2 = icmp ugt i32 %1, %shr
%lor.ext = zext i1 %2 to i32
store i32 %lor.ext, i32* @d, align 4, !tbaa !6
So it seems that the user has written a well-defined program, but the compiler has failed to realize this through the known constants.
clang -march=arch13 -O1 wrong0.i -o a.out -w; ./a.out
1
clang -march=arch13 -O2 wrong0.i -o a.out -w; ./a.out
0
clang -march=arch13 -O2 wrong0.i -o a.out -mllvm -enable-gvn-memdep=false -w; ./a.out
1
clang -march=arch13 -O2 wrong0.i -o a.out -mllvm -memssa-check-limit=0 -w; ./a.out
0
The store of '0' to @d is then later produced by GVN:
*** IR Dump After MergedLoadStoreMotionPass ***
; Function Attrs: nofree nounwind
define dso_local signext i32 @main() local_unnamed_addr #0 {
entry:
%e = alloca [3 x i32], align 4
%0 = bitcast [3 x i32]* %e to i8*
call void @llvm.lifetime.start.p0i8(i64 12, i8* nonnull %0) #2
call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 4 dereferenceable(12) %0, i8* noundef nonnull align 4 dereferenceable(12) bitcast ([3 x i32]* @__const.main.e to i8*), i64 12, i1 false)
%arrayidx = getelementptr inbounds [3 x i32], [3 x i32]* %e, i64 0, i64 0
store i32* %arrayidx, i32** @a, align 8, !tbaa !2
%arrayidx1 = getelementptr inbounds [3 x i32], [3 x i32]* %e, i64 0, i64 2
%1 = load i32, i32* %arrayidx1, align 4, !tbaa !6
%shr = lshr i32 7, %1
%2 = icmp ugt i32 %1, %shr
%lor.ext = zext i1 %2 to i32
store i32 %lor.ext, i32* @d, align 4, !tbaa !6
%3 = load i32*, i32** @c, align 8, !tbaa !2
store i32 %lor.ext, i32* %3, align 4, !tbaa !6
%4 = load i32, i32* @b, align 4, !tbaa !6
%call = call signext i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 signext %4)
call void @llvm.lifetime.end.p0i8(i64 12, i8* nonnull %0) #2
ret i32 0
}
*** IR Dump After GVN ***
; Function Attrs: nofree nounwind
define dso_local signext i32 @main() local_unnamed_addr #0 {
entry:
%e = alloca [3 x i32], align 4
%0 = bitcast [3 x i32]* %e to i8*
call void @llvm.lifetime.start.p0i8(i64 12, i8* nonnull %0) #2
call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 4 dereferenceable(12) %0, i8* noundef nonnull align 4 dereferenceable(12) bitcast ([3 x i32]* @__const.main.e to i8*), i64 12, i1 false)
%arrayidx = getelementptr inbounds [3 x i32], [3 x i32]* %e, i64 0, i64 0
store i32* %arrayidx, i32** @a, align 8, !tbaa !2
%arrayidx1 = getelementptr inbounds [3 x i32], [3 x i32]* %e, i64 0, i64 2
store i32 0, i32* @d, align 4, !tbaa !6
%1 = load i32*, i32** @c, align 8, !tbaa !2
store i32 0, i32* %1, align 4, !tbaa !6
%2 = load i32, i32* @b, align 4, !tbaa !6
%call = call signext i32 (i8*, ...) @printf(i8* nonnull dereferenceable(1) getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i64 0, i64 0), i32 signext %2)
call void @llvm.lifetime.end.p0i8(i64 12, i8* nonnull %0) #2
ret i32 0
}
The text was updated successfully, but these errors were encountered: