44536 – SLPVectorizer should drop nsw flags from add

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Transformation Utilities
Version:	trunk
Hardware:	PC All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

Reported:	2020-01-13 08:53 PST by Juneyoung Lee
Modified:	2020-01-31 07:17 PST
CC List:	6 users

Fixed By Commit(s):	bc1148e7bcb0

Attachments
horizontal.ll (1.30 KB, text/plain) 2020-01-13 08:53 PST, Juneyoung Lee

Description Juneyoung Lee 2020-01-13 08:53:20 PST

Created attachment 23014
horizontal.ll

```
$ cat horizontal.ll # excerpted from test/Transforms/SLPVectorizer/X86/horizontal.ll
@arr_i32 = global [32 x i32] zeroinitializer, align 16
declare i32 @foobar(i32)

define void @i32_red_call(i32 %val) {
entry:
  %0 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 0), align 16
  %1 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 1), align 4
  %add = add nsw i32 %1, %0
  %2 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 2), align 8
  %add.1 = add nsw i32 %2, %add
  %3 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 3), align 4
  %add.2 = add nsw i32 %3, %add.1
  %4 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 4), align 16
  %add.3 = add nsw i32 %4, %add.2
  %5 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 5), align 4
  %add.4 = add nsw i32 %5, %add.3
  %6 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 6), align 8
  %add.5 = add nsw i32 %6, %add.4
  %7 = load i32, i32* getelementptr inbounds ([32 x i32], [32 x i32]* @arr_i32, i64 0, i64 7), align 4
  %add.6 = add nsw i32 %7, %add.5
  %res = call i32 @foobar(i32 %add.6)
  ret void
}
$ opt -slp-vectorizer -S -o - -mtriple=x86_64-apple-macosx -mcpu=corei7-avx ./horizontal.ll
@arr_i32 = global [32 x i32] zeroinitializer, align 16
declare i32 @foobar(i32) #0
define void @i32_red_call(i32 %val) #0 {
entry:
  %0 = load <8 x i32>, <8 x i32>* bitcast ([32 x i32]* @arr_i32 to <8 x i32>*), align 16
  %rdx.shuf = shufflevector <8 x i32> %0, <8 x i32> undef, <8 x i32> <i32 4, i32 5, i32 6, i32 7, i32 undef, i32 undef, i32 undef, i32 undef>
  %bin.rdx = add nsw <8 x i32> %0, %rdx.shuf
  %rdx.shuf1 = shufflevector <8 x i32> %bin.rdx, <8 x i32> undef, <8 x i32> <i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
  %bin.rdx2 = add nsw <8 x i32> %bin.rdx, %rdx.shuf1
  %rdx.shuf3 = shufflevector <8 x i32> %bin.rdx2, <8 x i32> undef, <8 x i32> <i32 1, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef, i32 undef>
  %bin.rdx4 = add nsw <8 x i32> %bin.rdx2, %rdx.shuf3
  %1 = extractelement <8 x i32> %bin.rdx4, i32 0
  %res = call i32 @foobar(i32 %1)
  ret void
}

attributes #0 = { "target-cpu"="corei7-avx" }
```

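SLPVectorizer reorders the additions (the scalar chain becomes a shuffle-based tree reduction), so keeping the nsw flag on the vector adds can introduce undefined behavior: a partial sum in the new order may overflow even though no partial sum in the original order did.
As Reassociate does, SLPVectorizer should drop the nsw flags when it reorders the adds.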
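
To make the problem concrete, here is a rough Python model of the two evaluation orders. It is only a sketch: `add_nsw`, `add_wrap`, `scalar_chain` and `tree_reduction` are hypothetical helper names, `None` stands in for a poison value, and the input values are chosen by hand for illustration (they are not taken from the test file).

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

def add_nsw(a, b):
    """Model `add nsw i32`: signed overflow yields None (standing in for poison)."""
    if a is None or b is None:
        return None  # poison propagates through the add
    s = a + b
    return s if INT32_MIN <= s <= INT32_MAX else None

def add_wrap(a, b):
    """Model a plain `add i32` without flags: two's-complement wraparound."""
    return ((a + b + 2**31) % 2**32) - 2**31

def scalar_chain(v, add):
    """Original order: ((v[1] + v[0]) + v[2]) + ... + v[7]."""
    acc = add(v[1], v[0])
    for x in v[2:]:
        acc = add(x, acc)
    return acc

def tree_reduction(v, add):
    """Vectorized order: fold the upper half onto the lower half, then read lane 0."""
    lanes = list(v)
    width = len(lanes) // 2
    while width >= 1:
        lanes = [add(lanes[i], lanes[i + width]) for i in range(width)]
        width //= 2
    return lanes[0]

# Hand-picked input: the scalar chain never overflows,
# but the first reduction step computes v[0] + v[4] = INT32_MAX + 1.
vals = [INT32_MAX, -1, 0, 0, 1, 0, 0, 0]

print(scalar_chain(vals, add_nsw))    # well-defined in the original order
print(tree_reduction(vals, add_nsw))  # None: poison appears after reordering
print(tree_reduction(vals, add_wrap)) # flags dropped: same value as the scalar chain
```

With this input the original chain is well-defined, the reordered nsw adds produce poison, and the reordered adds without nsw give the same result as the original chain, which is why dropping the flag is enough.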

Comment 1 Sanjay Patel 2020-01-31 07:17:30 PST

https://reviews.llvm.org/rGbc1148e7bcb0