wrong code at -Os on x86_64-linux-gnu in 32-bit mode #29554

Closed
zhendongsu opened this issue Aug 30, 2016 · 11 comments
Labels
bugzilla Issues migrated from bugzilla clang Clang issues not falling into any other category

Comments

@zhendongsu

Bugzilla Link 30199
Resolution FIXED
Resolved on Sep 20, 2016 11:55
Version trunk
OS All
CC @majnemer,@hfinkel,@MatzeB

Extended Description

This is a regression from 3.8.x.

$ clang -v
clang version 4.0.0 (trunk 279985)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang-trunk/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.1.1
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
$
$ clang -m32 -O1 small.c; ./a.out
$ clang-3.8.1 -m32 -Os small.c; ./a.out
$
$ clang -m32 -Os small.c
$ ./a.out
Aborted (core dumped)
$


int b, c, d, e;

int fn1 ()
{
  int f;
  for (;;)
    {
      int g = c;
      if (b)
        for (;;)
          ;
      if (!d)
        goto L1;
      if (g)
        break;
    }
 L1:
  if (d)
    return e;
  return 2;
}

int main ()
{
  if (!fn1 ())
    __builtin_abort ();
  return 0;
}

@majnemer
Mannequin

majnemer mannequin commented Aug 31, 2016

Bisection points to:
Author: Matthias Braun <matze@braunis.de>
Date: Tue Jul 12 18:44:33 2016 +0000

BranchFolding: Use LivePhysReg to update live in lists.

Use LivePhysRegs with a backwards walking algorithm to update live in
lists, this way the results do not depend on the presence of kill flags
anymore.

This patch also reduces the number of registers added as live-in.
Previously all pristine registers as well as all sub registers of a
super register were added resulting in unnecessarily large live in
lists. This fixed llvm/llvm-project#25637.

Differential Revision: http://reviews.llvm.org/D22027

Matthias, can you please take a look?

@MatzeB
Contributor

MatzeB commented Sep 19, 2016

Sorry for the late response (got the report on vacation, missed it in my first mail filtering batch).

Anyway, I cannot reproduce this with ToT (r281914). This may be because I am on x86_64-apple-darwin16.0.0 and not on Linux, or because it was fixed in the meantime. I will try r279985 next.

@MatzeB
Contributor

MatzeB commented Sep 19, 2016

Works for me as well on r279978. For reference:

build/bin/clang -v
clang version 4.0.0 (trunk 279978) (llvm/trunk 279985)
Target: x86_64-apple-darwin16.0.0
Thread model: posix
InstalledDir: /Users/mbraun/dev/public_llvm/build/bin
build/bin/clang -O3 -m32 small.c
./a.out

To continue investigating, I would need the .ll file produced by -m32 -Os -S -emit-llvm, ideally along with the assembly files produced by a working and a broken compiler.

@MatzeB
Contributor

MatzeB commented Sep 19, 2016

(Works for -Os as well:

build/bin/clang -Os -m32 small.c
./a.out

)

@zhendongsu
Author

Matthias, please see below:

$ clang -v
clang version 4.0.0 (trunk 281848)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang-trunk/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.1.1
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
$
$ clang -m32 -Os small.c
$ ./a.out
Aborted (core dumped)
$
$ clang -m32 -Os small.c -emit-llvm -S -Xclang -disable-llvm-optzns
$
$ cat small.ll
; ModuleID = 'small.c'
source_filename = "small.c"
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"

@c = common global i32 0, align 4
@b = common global i32 0, align 4
@d = common global i32 0, align 4
@e = common global i32 0, align 4

; Function Attrs: nounwind optsize
define i32 @fn1() #0 {
entry:
  %retval = alloca i32, align 4
  %f = alloca i32, align 4
  %g = alloca i32, align 4
  %cleanup.dest.slot = alloca i32
  %0 = bitcast i32* %f to i8*
  call void @llvm.lifetime.start(i64 4, i8* %0) #3
  br label %for.cond

for.cond: ; preds = %cleanup.cont, %entry
  %1 = bitcast i32* %g to i8*
  call void @llvm.lifetime.start(i64 4, i8* %1) #3
  %2 = load i32, i32* @c, align 4, !tbaa !1
  store i32 %2, i32* %g, align 4, !tbaa !1
  %3 = load i32, i32* @b, align 4, !tbaa !1
  %tobool = icmp ne i32 %3, 0
  br i1 %tobool, label %if.then, label %if.end

if.then: ; preds = %for.cond
  br label %for.cond1

for.cond1: ; preds = %for.cond1, %if.then
  br label %for.cond1

if.end: ; preds = %for.cond
  %4 = load i32, i32* @d, align 4, !tbaa !1
  %tobool2 = icmp ne i32 %4, 0
  br i1 %tobool2, label %if.end4, label %if.then3

if.then3: ; preds = %if.end
  store i32 6, i32* %cleanup.dest.slot, align 4
  br label %cleanup

if.end4: ; preds = %if.end
  %5 = load i32, i32* %g, align 4, !tbaa !1
  %tobool5 = icmp ne i32 %5, 0
  br i1 %tobool5, label %if.then6, label %if.end7

if.then6: ; preds = %if.end4
  store i32 2, i32* %cleanup.dest.slot, align 4
  br label %cleanup

if.end7: ; preds = %if.end4
  store i32 0, i32* %cleanup.dest.slot, align 4
  br label %cleanup

cleanup: ; preds = %if.then3, %if.end7, %if.then6
  %6 = bitcast i32* %g to i8*
  call void @llvm.lifetime.end(i64 4, i8* %6) #3
  %cleanup.dest = load i32, i32* %cleanup.dest.slot, align 4
  switch i32 %cleanup.dest, label %cleanup11 [
    i32 0, label %cleanup.cont
    i32 2, label %for.end
    i32 6, label %L1
  ]

cleanup.cont: ; preds = %cleanup
  br label %for.cond

for.end: ; preds = %cleanup
  br label %L1

L1: ; preds = %for.end, %cleanup
  %7 = load i32, i32* @d, align 4, !tbaa !1
  %tobool8 = icmp ne i32 %7, 0
  br i1 %tobool8, label %if.then9, label %if.end10

if.then9: ; preds = %L1
  %8 = load i32, i32* @e, align 4, !tbaa !1
  store i32 %8, i32* %retval, align 4
  store i32 1, i32* %cleanup.dest.slot, align 4
  br label %cleanup11

if.end10: ; preds = %L1
  store i32 2, i32* %retval, align 4
  store i32 1, i32* %cleanup.dest.slot, align 4
  br label %cleanup11

cleanup11: ; preds = %if.end10, %if.then9, %cleanup
  %9 = bitcast i32* %f to i8*
  call void @llvm.lifetime.end(i64 4, i8* %9) #3
  %10 = load i32, i32* %retval, align 4
  ret i32 %10
}

; Function Attrs: argmemonly nounwind
declare void @llvm.lifetime.start(i64, i8* nocapture) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.lifetime.end(i64, i8* nocapture) #1

; Function Attrs: nounwind optsize
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  %call = call i32 @fn1() #4
  %tobool = icmp ne i32 %call, 0
  br i1 %tobool, label %if.end, label %if.then

if.then: ; preds = %entry
  call void @abort() #5
  unreachable

if.end: ; preds = %entry
  ret i32 0
}

; Function Attrs: noreturn nounwind optsize
declare void @abort() #2

attributes #0 = { nounwind optsize "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { argmemonly nounwind }
attributes #2 = { noreturn nounwind optsize "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { nounwind }
attributes #4 = { optsize }
attributes #5 = { noreturn nounwind optsize }

!llvm.ident = !{!0}

!0 = !{!"clang version 4.0.0 (trunk 281848)"}
!1 = !{!2, !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}
$

@MatzeB
Contributor

MatzeB commented Sep 19, 2016

Could you also provide the .ll file without "-Xclang -disable-llvm-optzns"?

@zhendongsu
Author

Here it is, Matthias:

$ clang -m32 -Os small.c -emit-llvm -S
$ cat small.ll
; ModuleID = 'small.c'
source_filename = "small.c"
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"

@c = common local_unnamed_addr global i32 0, align 4
@b = common local_unnamed_addr global i32 0, align 4
@d = common local_unnamed_addr global i32 0, align 4
@e = common local_unnamed_addr global i32 0, align 4

; Function Attrs: norecurse nounwind optsize readonly
define i32 @fn1() local_unnamed_addr #0 {
entry:
  %0 = load i32, i32* @b, align 4, !tbaa !1
  %tobool = icmp eq i32 %0, 0
  br i1 %tobool, label %entry.split, label %for.cond1.preheader

for.cond1.preheader: ; preds = %entry
  br label %for.cond1

entry.split: ; preds = %entry
  %1 = load i32, i32* @d, align 4
  %tobool2 = icmp eq i32 %1, 0
  %2 = load i32, i32* @c, align 4
  %tobool5 = icmp eq i32 %2, 0
  %. = select i1 %tobool5, i3 0, i3 2
  %cleanup.dest.slot.0 = select i1 %tobool2, i3 -2, i3 %.
  switch i3 %cleanup.dest.slot.0, label %cleanup11 [
    i3 2, label %L1.split
    i3 -2, label %L1.split
    i3 0, label %infloop.preheader
  ]

infloop.preheader: ; preds = %entry.split
  br label %infloop

for.cond1: ; preds = %for.cond1.preheader, %for.cond1
  br label %for.cond1

L1.split: ; preds = %entry.split, %entry.split
  %3 = load i32, i32* @e, align 4
  %.14 = select i1 %tobool2, i32 2, i32 %3
  ret i32 %.14

cleanup11: ; preds = %entry.split
  ret i32 undef

infloop: ; preds = %infloop.preheader, %infloop
  br label %infloop
}

; Function Attrs: nounwind optsize
define i32 @main() local_unnamed_addr #1 {
entry:
  %call = tail call i32 @fn1() #3
  %tobool = icmp eq i32 %call, 0
  br i1 %tobool, label %if.then, label %if.end

if.then: ; preds = %entry
  tail call void @abort() #4
  unreachable

if.end: ; preds = %entry
  ret i32 0
}

; Function Attrs: noreturn nounwind optsize
declare void @abort() local_unnamed_addr #2

attributes #0 = { norecurse nounwind optsize readonly "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind optsize "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { noreturn nounwind optsize "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="pentium4" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { optsize }
attributes #4 = { noreturn nounwind optsize }

!llvm.ident = !{!0}

!0 = !{!"clang version 4.0.0 (trunk 281848)"}
!1 = !{!2, !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}
$

@MatzeB
Contributor

MatzeB commented Sep 20, 2016

Control flow merging merges a
RET 0, %EAX
into
RET 0, %EAX<undef>
(without dropping the <undef>); after that, the dependency breaker code thinks it is fine to rename some EAX writing instructions to write to ECX.

Before my change, we used to produce wildly conservative live-in lists with lots of registers that were not actually used later (including EAX), so the issue did not show up.

@MatzeB
Contributor

MatzeB commented Sep 20, 2016

Fixed in r281957. I will nominate the fix for 3.9.1.

@MatzeB
Contributor

MatzeB commented Nov 26, 2021

mentioned in issue llvm/llvm-bugzilla-archive#30261

@MatzeB
Contributor

MatzeB commented Nov 26, 2021

mentioned in issue llvm/llvm-bugzilla-archive#30463

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.