Failure to merge 2 * <2 x float> load + fadd and then split the result back to 2 * <2 x float> #41367
Comments
Assigned to @anton-afanasyev
A similar codegen issue for a Matrix22 type: https://godbolt.org/z/gG1_GG
The root cause is in SLPVectorizer: struct {<2 x float>, <2 x float>} isn't recognized as vectorizable to <4 x float>. Assigned to myself.
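For illustration, a minimal sketch of the kind of source that produces this shape. The `Vec4` type and `add` function are hypothetical and are not taken from the godbolt links in this thread. It assumes the x86-64 SysV ABI, where Clang typically returns a struct of four floats as {<2 x float>, <2 x float>}, so a <4 x float> fadd has to be rebuilt from, and then split back into, the two halves.

```cpp
// Hypothetical 4-float aggregate (an assumption for illustration only).
// On x86-64 SysV, Clang typically lowers the by-value return of this
// type to { <2 x float>, <2 x float> } in LLVM IR.
struct Vec4 {
    float x, y, z, w;
};

// Element-wise add. Ideally this becomes one <4 x float> fadd rather
// than two separate <2 x float> fadds (or four scalar ones).
Vec4 add(const Vec4 &a, const Vec4 &b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
```

Compiling something like this with optimizations and inspecting the IR or assembly should show whether the two halves are merged into a single vector operation.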
Here is the fix for the vector part of this issue: https://reviews.llvm.org/D70068
Here is one of the fixes to support matrix vectorization: https://reviews.llvm.org/D70924
Another part: https://reviews.llvm.org/D70587
Follow-up change to support matrix vectorization: https://reviews.llvm.org/D72689. It provides revectorization of partially vectorized code.
Except for the first and last functions, Clang trunk codegen looks worse and bigger than Clang 9.
Clang9:
Trunk:
Technically VectorCombine could fix a lot of this, but we should work out what caused it.
Hmm, I'm looking into this.
Can someone bisect to see where the code quality regressed between clang 9 and 10?
Similar to bug 45015, we should get the vector math ops back with:
...but we need to enhance VectorCombine to get ideal IR, and VectorCombine is independent of any regressions between clang9 and clang10.
Yes, I've found that this regression is not related to the current PR or to the related commits (D70924, D70587, D72689).
I don't think we can block the release on this. Unblocking.
Some patches landed so… fixed?
Some of these matrix cases still need attention.
Extended Description
Another missed vectorization opportunity for
https://godbolt.org/z/lbAN92
Not sure whether this can be handled in the SLP vectorizer or the DAG, but we should be able to do something like: