Vectorized fp min/max reduction loop uses minnum/maxnum for horizontal reduction, but not loop body #42919

Open
topperc opened this issue Oct 5, 2019 · 0 comments
Labels
bugzilla Issues migrated from bugzilla loopoptim

topperc commented Oct 5, 2019

Bugzilla Link 43574
Version trunk
OS All
CC @hfinkel,@RKSimon,@rotateright

Extended Description

For a min/max reduction loop, the vectorizer adds fast-math flags to the fcmp/select it generates for the horizontal reduction after the loop, but the selects inside the loop body carry no fast-math flags. As a result, InstCombine turns only the horizontal reduction into fmaxnum/fminnum intrinsics and leaves the loop body as fcmp/select. X86 codegens both the loop body and the horizontal reduction to the FP min/max instructions, so the difference didn't end up mattering. It just looks inconsistent.

clang -Ofast -march=skylake-avx512

float a[1024];

float foo() {
  float min = 0.0;
  for (int i = 0; i != 1024; ++i)
    min = min < a[i] ? min : a[i];

  return min;
}

https://godbolt.org/z/AMWYWm

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021