The following code is miscompiled by the current clang trunk on x86_64-linux-gnu at -O3 in 32-bit mode (but not in 64-bit mode). This is a regression from 3.6.x. $ clang-trunk -v clang version 3.8.0 (trunk 245675) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/bin Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.2 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.1.0 Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Candidate multilib: x32;@mx32 Selected multilib: .;@m64 $ $ clang-trunk -m32 -O2 small.c; ./a.out -1 $ clang-trunk -m64 -O3 small.c; ./a.out -1 $ clang-3.6 -m32 -O3 small.c; ./a.out -1 $ $ clang-trunk -m32 -O3 small.c $ ./a.out 127 $ ---------------------------------- int printf (const char *, ...); int a, b, d, e, f, g; char c; void fn1 (int p) { for (b = 1; b; b = a) g = c ? c : p ? 0 : a; } int main () { c--; fn1 (0); d--; a = e && f; d && (e = 0); printf ("%d\n", c); return 0; }
Bisection points to r244503: x86: Emit LAHF/SAHF instead of PUSHF/POPF NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask syste~ As with the previous patch this code generation is pretty bad because it occurs very later, after register allocatio~ I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswel~ | Time per call (ms) | Runtime (ms) | Benchmark | | 0.000012514 | 6257 | sete.i386 | | 0.000012810 | 6405 | sete.i386-fast | | 0.000010456 | 5228 | sete.x86-64 | | 0.000010496 | 5248 | sete.x86-64-fast | | 0.000012906 | 6453 | lahf-sahf.i386 | | 0.000013236 | 6618 | lahf-sahf.i386-fast | | 0.000010580 | 5290 | lahf-sahf.x86-64 | | 0.000010304 | 5152 | lahf-sahf.x86-64-fast | | 0.000028056 | 14028 | pushf-popf.i386 | | 0.000027160 | 13580 | pushf-popf.i386-fast | | 0.000023810 | 11905 | pushf-popf.x86-64 | | 0.000026468 | 13234 | pushf-popf.x86-64-fast | Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at ~ JF, can you take a look?
Created attachment 14769 [details] New (buggy) O3 output
Created attachment 14770 [details] New O2 output
Created attachment 14771 [details] Old O2 output
Created attachment 14772 [details] Old O3 output
Comparing the new (left) and old (right) code at O3, the only difference is here: decl 0x804a024 decl 0x804a024 push %eax pushf <<< seto %al <<< lahf <<< mov %eax,%ecx <<< pop %eax pop %ecx <<< cmpl $0x0,0x804a034 cmpl $0x0,0x804a034 setne %dl setne %dl cmpl $0x0,0x804a02c cmpl $0x0,0x804a02c setne %ah setne %ah and %dl,%ah and %dl,%ah movzbl %ah,%edx movzbl %ah,%edx mov %edx,0x804a03c mov %edx,0x804a03c mov %ecx,%eax push %ecx <<< add $0x7f,%al popf <<< sahf <<< je 80484ff <main+0x8f> je 80484f6 <main+0x86> movl $0x0,0x804a034 movl $0x0,0x804a034 movsbl %al,%eax movsbl %al,%eax mov %eax,0x4(%esp) mov %eax,0x4(%esp) movl $0x80485b0,(%esp) movl $0x80485a0,(%esp) call 80482f0 <printf@plt> call 80482f0 <printf@plt> xor %eax,%eax xor %eax,%eax add $0xc,%esp add $0xc,%esp ret ret
See potential dup: https://llvm.org/bugs/show_bug.cgi?id=25033#c7
*** This bug has been marked as a duplicate of bug 25033 ***