LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 50253 - wrong code at -O2 and -O3 on x86_64-linux-gnu (generated code segfaults)
Status: RESOLVED DUPLICATE of bug 49661
Alias: None
Product: libraries
Classification: Unclassified
Component: Loop Optimizer
Version: trunk
Hardware: PC All
Importance: P enhancement
Assignee: Unassigned LLVM Bugs
Reported: 2021-05-07 00:49 PDT by Zhendong Su
Modified: 2021-05-20 09:39 PDT
CC List: 3 users

Fixed By Commit(s): f34311c4024d

Description Zhendong Su 2021-05-07 00:49:16 PDT
[515] % clangtk -v
clang version 13.0.0 (https://github.com/llvm/llvm-project.git f7294ac8093a2fbd8c00254580eaac6c4e1f7b24)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /local/suz-local/opfuzz/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
[516] % 
[516] % clangtk -Os small.c; ./a.out
[517] % 
[517] % clangtk -O2 small.c
[518] % ./a.out
Segmentation fault
[519] % 
[519] % cat small.c
static int a, *b[10][9];
int main() {
  for (a = 0; a < 9; a++)
    b[5][a] = b[a+1][4];
  return 0;
}
Comment 1 Zhendong Su 2021-05-07 00:55:16 PDT
The following should be due to the same root cause, but it also fails at -Os:

[533] % clangtk -O1 small.c; ./a.out
[534] % 
[534] % clangtk -Os small.c; ./a.out
Segmentation fault
[535] % clangtk -O2 small.c; ./a.out
Segmentation fault
[536] % clangtk -O3 small.c; ./a.out
Segmentation fault
[537] % 
[537] % cat small.c
static int *a[2][7], b;
int main() {
  for (b = 0; b < 2; b++)
    a[1][b+1] = a[0][0];
  return 0;
}
Comment 2 Tee KOBAYASHI 2021-05-08 23:06:23 PDT
This reproduces for me with version 12.0.0, but not with 11.1.0.

The direct cause of the segfaults is MOVAPS to unaligned memory, as shown in the objdump:

0000000000401100 <main>:
  401100: 0f 57 c0                      xorps   %xmm0, %xmm0
  401103: 0f 29 05 36 2f 00 00          movaps  %xmm0, 12086(%rip)  # 404040 <b.5>
  40110a: 0f 29 05 3f 2f 00 00          movaps  %xmm0, 12095(%rip)  # 404050 <b.5+0x10>
  401111: 0f 29 05 50 2f 00 00          movaps  %xmm0, 12112(%rip)  # 404068 <b.5+0x28>
  401118: 0f 29 05 59 2f 00 00          movaps  %xmm0, 12121(%rip)  # 404078 <b.5+0x38>
  40111f: 31 c0                         xorl    %eax, %eax
  401121: c3                            retq
  401122: 66 2e 0f 1f 84 00 00 00 00 00 nopw    %cs:(%rax,%rax)
  40112c: 0f 1f 40 00                   nopl    (%rax)
Comment 3 Sanjay Patel 2021-05-14 12:17:45 PDT
I've never looked at -globalopt before, but this seems wrong:

@a = internal unnamed_addr global [2 x [7 x i32*]] zeroinitializer, align 16

define void @PR50253() {
  %x = load i32*, i32** getelementptr inbounds ([2 x [7 x i32*]], [2 x [7 x i32*]]* @a, i64 0, i64 0, i64 0), align 16
  store i32* %x, i32** getelementptr inbounds ([2 x [7 x i32*]], [2 x [7 x i32*]]* @a, i64 0, i64 1, i64 1), align 16
  store i32* %x, i32** getelementptr inbounds ([2 x [7 x i32*]], [2 x [7 x i32*]]* @a, i64 0, i64 1, i64 2), align 8
  ret void
}

------------------------------------------------------------------------

$ opt -globalopt align.ll -S -debug
Args: ./opt -globalopt align.ll -S -debug 
PERFORMING GLOBAL SRA ON: @a = internal unnamed_addr global [2 x [7 x i32*]] zeroinitializer, align 16
MARKING CONSTANT: @a.0 = internal unnamed_addr global [7 x i32*] zeroinitializer, align 16
PERFORMING GLOBAL SRA ON: @a.0 = internal unnamed_addr constant [7 x i32*] zeroinitializer, align 16
GLOBAL NEVER LOADED: @a.1 = internal unnamed_addr global [7 x i32*] zeroinitializer, align 16
MARKING CONSTANT: @a.0.0 = internal unnamed_addr global i32* null, align 16
   *** Marking constant allowed us to simplify all users and delete global!
GLOBAL NEVER LOADED: @a.1 = internal unnamed_addr global [7 x i32*] zeroinitializer, align 16
; ModuleID = 'align.ll'
source_filename = "align.ll"

@a.1 = internal unnamed_addr global [7 x i32*] zeroinitializer, align 16

define void @PR50253() local_unnamed_addr {
  store i32* null, i32** getelementptr inbounds ([7 x i32*], [7 x i32*]* @a.1, i32 0, i64 1), align 16
  store i32* null, i32** getelementptr inbounds ([7 x i32*], [7 x i32*]* @a.1, i32 0, i64 2), align 8
  ret void
}

------------------------------------------------------------------------

In the original code, we accessed a[1][1], and that's correctly shown as "align 16" because it's the 8th element past a[0][0] which is "align 16" at the declaration.

But in the transformed code, we created a.1 with "align 16" and accessed a.1[0][1], which is 8 bytes past the start of a.1 and so can't be "align 16".
Comment 4 Sanjay Patel 2021-05-15 06:10:16 PDT
https://reviews.llvm.org/D102552
Comment 5 Sanjay Patel 2021-05-20 09:39:24 PDT
This was the same problem as bug 49661.

https://reviews.llvm.org/rGf34311c4024d

*** This bug has been marked as a duplicate of bug 49661 ***