Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[x86] crash and fail to match pmaddwd #49060

Closed
llvmbot opened this issue Mar 24, 2021 · 5 comments
Closed

[x86] crash and fail to match pmaddwd #49060

llvmbot opened this issue Mar 24, 2021 · 5 comments
Labels
backend:X86 bugzilla Issues migrated from bugzilla

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 24, 2021

Bugzilla Link 49716
Resolution FIXED
Resolved on Mar 31, 2021 07:19
Version trunk
OS Windows NT
Attachments Non-reduced, un-preprocessed source
Reporter LLVM Bugzilla Contributor
CC @topperc,@RKSimon,@phoebewang,@rotateright,@oToToT
Fixed by commit(s) a283d72 e694e19

Extended Description

Here is a reduced test case:

#include <stdlib.h>
typedef union {
int16_t i16 attribute((vector_size(32)));
int32_t i32 attribute((vector_size(32)));
} simde__m256i_private;
simde__m256i_private simde__m256i_to_private();
int simde_mm256_madd_epi16() {
simde__m256i_private r_, a_ = simde__m256i_to_private(),
b_ = simde__m256i_to_private();
for (size_t i = 0; i < sizeof sizeof(r_); i += 2)
r_.i32[i] = a_.i16[i] * b_.i16[i] + a_.i16[i + 1] * b_.i16[i + 1];
simde__m256i_from_private(r_);
}

Compile with -O2 using clang (clang++ works) on x86_64. Godbolt link: https://godbolt.org/z/71o5hdY4h

The problem only manifests in my codebase with clang 12, but this test case seems to reliably reproduce the issue in earlier versions as well (back to 7 on godbolt).

I'm also attaching the original (non-reduced) source. Please let me know if you need any additional information.

@rotateright
Copy link
Contributor

This should prevent the crashing:
https://reviews.llvm.org/rGa283d7258360

But we want the output to be a "pmaddwd" instruction?

@rotateright
Copy link
Contributor

The matching code for this pattern was translated from a different/existing pattern here:
https://reviews.llvm.org/D49636

But it looks like we need to make more adjustments - extract subvector, not truncate? Also possible that the inputs are shorter vectors than the output?

@topperc
Copy link
Collaborator

topperc commented Mar 29, 2021

Is the "sizeof sizeof(r_)" in the for loop a mistake in your code that exposed the bug? It doesn't logically make sense.

@rotateright
Copy link
Contributor

More pmadd matching:
https://reviews.llvm.org/rGe694e19a7931

@rotateright
Copy link
Contributor

Resolving as fixed.

I added a test based on Craig's suggestion in https://reviews.llvm.org/D99531 that shows we could go even further to try to match pmadd, so if there's a real-world need for that, please file another bug.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

3 participants