LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 24535 - wrong code at -O3 on x86_64-linux-gnu in 32-bit mode
Status: RESOLVED DUPLICATE of bug 25033
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86
Version: trunk
Hardware: PC All
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-08-21 11:38 PDT by Zhendong Su
Modified: 2015-12-08 15:05 PST

See Also:
Fixed By Commit(s):


Attachments
New (buggy) O3 output (13.24 KB, application/octet-stream)
2015-08-24 18:40 PDT, JF Bastien
New O2 output (12.64 KB, application/octet-stream)
2015-08-24 18:41 PDT, JF Bastien
Old O2 output (12.61 KB, application/octet-stream)
2015-08-24 18:41 PDT, JF Bastien
Old O3 output (12.90 KB, application/octet-stream)
2015-08-24 18:41 PDT, JF Bastien

Description Zhendong Su 2015-08-21 11:38:12 PDT
The following code is miscompiled by the current clang trunk on x86_64-linux-gnu at -O3 in 32-bit mode (but not in 64-bit mode).

This is a regression from 3.6.x. 

$ clang-trunk -v
clang version 3.8.0 (trunk 245675)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.1.0
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
$         
$ clang-trunk -m32 -O2 small.c; ./a.out
-1
$ clang-trunk -m64 -O3 small.c; ./a.out
-1
$ clang-3.6 -m32 -O3 small.c; ./a.out
-1
$ 
$ clang-trunk -m32 -O3 small.c
$ ./a.out
127
$ 


----------------------------------


int printf (const char *, ...);

int a, b, d, e, f, g;
char c;

void
fn1 (int p)
{
  for (b = 1; b; b = a)
    g = c ? c : p ? 0 : a;
}

int
main ()
{
  c--;
  fn1 (0);
  d--;
  a = e && f;
  d && (e = 0);
  printf ("%d\n", c);
  return 0;
}
Comment 1 David Majnemer 2015-08-21 23:34:29 PDT
Bisection points to r244503:
    x86: Emit LAHF/SAHF instead of PUSHF/POPF

    NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (privileged emulators have forgotten to mask syste~

    As with the previous patch this code generation is pretty bad because it occurs very late, after register allocatio~

    I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswel~

    | Time per call (ms)  | Runtime (ms) | Benchmark                      |
    | 0.000012514         |      6257    | sete.i386                      |
    | 0.000012810         |      6405    | sete.i386-fast                 |
    | 0.000010456         |      5228    | sete.x86-64                    |
    | 0.000010496         |      5248    | sete.x86-64-fast               |
    | 0.000012906         |      6453    | lahf-sahf.i386                 |
    | 0.000013236         |      6618    | lahf-sahf.i386-fast            |
    | 0.000010580         |      5290    | lahf-sahf.x86-64               |
    | 0.000010304         |      5152    | lahf-sahf.x86-64-fast          |
    | 0.000028056         |     14028    | pushf-popf.i386                |
    | 0.000027160         |     13580    | pushf-popf.i386-fast           |
    | 0.000023810         |     11905    | pushf-popf.x86-64              |
    | 0.000026468         |     13234    | pushf-popf.x86-64-fast         |
    
    Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seem to be worth teaching LLVM about individual flags, at ~
    


JF, can you take a look?
Comment 2 JF Bastien 2015-08-24 18:40:59 PDT
Created attachment 14769 [details]
New (buggy) O3 output
Comment 3 JF Bastien 2015-08-24 18:41:18 PDT
Created attachment 14770 [details]
New O2 output
Comment 4 JF Bastien 2015-08-24 18:41:35 PDT
Created attachment 14771 [details]
Old O2 output
Comment 5 JF Bastien 2015-08-24 18:41:50 PDT
Created attachment 14772 [details]
Old O3 output
Comment 6 JF Bastien 2015-08-24 18:48:30 PDT
Comparing the new (left) and old (right) code at O3, the only difference is here:

decl   0x804a024                  decl   0x804a024                
push   %eax                       pushf                           <<<
seto   %al                                                        <<<
lahf                                                              <<<
mov    %eax,%ecx                                                  <<<
pop    %eax                       pop    %ecx                     <<<
cmpl   $0x0,0x804a034             cmpl   $0x0,0x804a034           
setne  %dl                        setne  %dl                      
cmpl   $0x0,0x804a02c             cmpl   $0x0,0x804a02c           
setne  %ah                        setne  %ah                      
and    %dl,%ah                    and    %dl,%ah                  
movzbl %ah,%edx                   movzbl %ah,%edx                 
mov    %edx,0x804a03c             mov    %edx,0x804a03c           
mov    %ecx,%eax                  push   %ecx                     <<< 
add    $0x7f,%al                  popf                            <<<
sahf                                                              <<<
je     80484ff <main+0x8f>        je     80484f6 <main+0x86>                                      
movl   $0x0,0x804a034             movl   $0x0,0x804a034           
movsbl %al,%eax                   movsbl %al,%eax                 
mov    %eax,0x4(%esp)             mov    %eax,0x4(%esp)           
movl   $0x80485b0,(%esp)          movl   $0x80485a0,(%esp)        
call   80482f0 <printf@plt>       call   80482f0 <printf@plt>     
xor    %eax,%eax                  xor    %eax,%eax                
add    $0xc,%esp                  add    $0xc,%esp                
ret                               ret
Comment 7 JF Bastien 2015-12-03 11:02:59 PST
See potential dup:
  https://llvm.org/bugs/show_bug.cgi?id=25033#c7
Comment 8 Matthias Braun 2015-12-08 15:05:33 PST

*** This bug has been marked as a duplicate of bug 25033 ***