Invalid register allocation for AVX512 gather #41400

Closed
d0k opened this issue May 29, 2019 · 5 comments
Labels
backend:X86 bugzilla Issues migrated from bugzilla

Comments

@d0k
Member

d0k commented May 29, 2019

Bugzilla Link 42055
Resolution FIXED
Resolved on Jul 23, 2019 06:43
Version trunk
OS Linux
Attachments reduced test case
CC @topperc,@preames,@RKSimon,@rotateright
Fixed by commit(s) r362015

Extended Description

I came across this while chasing a SIGILL. The intrinsics were generated by the LoopVectorizer. The manual says: "If any pair of the index, mask, or destination registers are the same, this instruction results a UD fault."

$ cat t.ll
define void @reduced() #0 {
  %wide.masked.gather38 = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> undef, i32 4, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false>, <16 x float> undef)
  %1 = fadd fast <16 x float> %wide.masked.gather38, zeroinitializer
  call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %1, <16 x float>* undef, i32 4, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false>)
  ret void
}

; Function Attrs: nounwind readonly
declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*>, i32 immarg, <16 x i1>, <16 x float>) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.masked.store.v16f32.p0v16f32(<16 x float>, <16 x float>*, i32 immarg, <16 x i1>) #2

attributes #0 = { "unsafe-fp-math"="true" }
attributes #1 = { nounwind readonly }
attributes #2 = { argmemonly nounwind }

$ llc -mcpu=skx < t.ll | llvm-mc
<stdin>:10:2: warning: index and destination registers should be distinct
vgatherqps (,%zmm0), %ymm0 {%k1}
^
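
For illustration, the constraint behind this warning can be modeled as a small standalone check. This is only a sketch, not code from LLVM or llvm-mc: the `Reg`, `RegFile`, `aliases`, and `gatherOperandsConflict` names are hypothetical. It assumes that xmmN, ymmN, and zmmN name the same physical vector register N (which is why `%ymm0` and `%zmm0` collide here) and that the `k` mask registers form a separate register file.

```cpp
// Hypothetical register model: xmmN/ymmN/zmmN share one physical vector
// register N, while kN mask registers live in a separate register file.
enum class RegFile { Vector, Mask };

struct Reg {
    RegFile file;
    unsigned num;
};

// Two operands overlap when they are in the same file with the same number.
bool aliases(const Reg &a, const Reg &b) {
    return a.file == b.file && a.num == b.num;
}

// True if any pair among destination, index, and mask overlaps, which is the
// situation the quoted manual text describes as faulting.
bool gatherOperandsConflict(const Reg &dest, const Reg &index, const Reg &mask) {
    return aliases(dest, index) || aliases(dest, mask) || aliases(index, mask);
}
```

Under this model, the operands of `vgatherqps (,%zmm0), %ymm0 {%k1}` conflict because destination and index both map to vector register 0.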

@topperc
Collaborator

topperc commented May 29, 2019

Have you seen this occur without the pointer operand being undef?

@d0k
Member Author

d0k commented May 29, 2019

Unreduced test case
Bugpoint made them undef; I attached the unreduced test case.

@topperc
Collaborator

topperc commented May 29, 2019

Looks like this gather ends up getting split into two gathers, where the high one has all undef indices. Presumably this occurs because the mask for these elements is also all 0s.

%4 = load <16 x float>, <16 x float>** %3, align 8, !invariant.load !0, !dereferenceable !3, !align !4
%5 = getelementptr inbounds [256 x float], [256 x float] %1, i64 0, <16 x i64> <i64 0, i64 56, i64 112, i64 168, i64 224, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef, i64 undef>
%wide.masked.gather = call <16 x float> @llvm.masked.gather.v16f32.v16p0f32(<16 x float*> %5, i32 16, <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false>, <16 x float> undef), !invariant.load !0, !noalias !5

The easiest fix might be to add a DAG combine to detect the all 0s mask and convert the gather into just the passthru result.
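
The reasoning behind that fold can be sketched with a standalone model of masked-gather semantics. This is not SelectionDAG code and not the committed patch; `maskedGather`, `isAllZeroMask`, and `foldedGather` are hypothetical names chosen for illustration. The model assumes each lane loads only when its mask bit is set and otherwise keeps the passthru value, so with an all-zero mask the indices are never read and the result is just the passthru.

```cpp
#include <array>
#include <cstddef>

// Reference semantics for an N-lane masked gather: lane i loads
// base[index[i]] when mask[i] is set, otherwise it keeps passthru[i].
// Indices of masked-off lanes are never read.
template <std::size_t N>
std::array<float, N> maskedGather(const float *base,
                                  const std::array<std::size_t, N> &index,
                                  const std::array<bool, N> &mask,
                                  const std::array<float, N> &passthru) {
    std::array<float, N> result = passthru;
    for (std::size_t i = 0; i < N; ++i)
        if (mask[i])
            result[i] = base[index[i]];
    return result;
}

// True when no lane is enabled.
template <std::size_t N>
bool isAllZeroMask(const std::array<bool, N> &mask) {
    for (bool bit : mask)
        if (bit)
            return false;
    return true;
}

// Model of the suggested fold: an all-zero mask means the gather can be
// replaced by its passthru operand without touching memory or indices.
template <std::size_t N>
std::array<float, N> foldedGather(const float *base,
                                  const std::array<std::size_t, N> &index,
                                  const std::array<bool, N> &mask,
                                  const std::array<float, N> &passthru) {
    if (isAllZeroMask(mask))
        return passthru;
    return maskedGather(base, index, mask, passthru);
}
```

In this model the high half of the split gather (eight lanes, all masked off) folds away entirely, so no gather instruction with a conflicting index register would need to be emitted for it.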

@RKSimon
Collaborator

RKSimon commented May 30, 2019

Patch: https://reviews.llvm.org/D62613 (committed at: rL362015)

@RKSimon
Collaborator

RKSimon commented Jul 23, 2019

Patch: https://reviews.llvm.org/D62613 (committed at: rL362015)

Resolving

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.