Vectorized fp min/max reduction loop uses minnum/maxnum for horizontal reduction, but not loop body #42919

topperc · 2019-10-05T17:11:09Z


Bugzilla Link	43574
Version	trunk
OS	All
CC	@hfinkel,@RKSimon,@rotateright

Extended Description

For a min/max reduction loop, the vectorizer adds fast-math flags to the fcmp/select pairs it creates for the horizontal reduction after the loop, but the selects inside the loop body don't get fast-math flags. As a result, InstCombine turns the horizontal reduction into minnum/maxnum intrinsics while leaving the loop body as fcmp/select. X86 codegens both the loop body and the horizontal reduction to the FP min/max instructions, so the difference doesn't end up mattering here. It just looks inconsistent.

clang -Ofast -march=skylake-avx512

float a[1024];

float foo() {
  float min = 0.0;
  for (int i = 0; i != 1024; ++i)
    min = min < a[i] ? min : a[i];

  return min;
}

https://godbolt.org/z/AMWYWm

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
