Failure to simplify trunc(abs(sext(x))) -> abs(x) #48160
Comments
Assigned to @rotateright
The lowering may be less efficient on some targets, but it won't be worse than the original code (worst case, the target can just emit the old code).
It's okay to drop the poison flag on abs during transforms; it's not particularly important. It's not like ctlz etc., where the flag influences lowering. It only exists so we can infer that abs(x) >= 0.
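As a rough illustration of what the flag controls, the sketch below models `llvm.abs` on i8 in Python. The names `abs_i8` and `POISON` are hypothetical; the model only mirrors the intrinsic's documented treatment of the minimum value and is not taken from LLVM itself.

```python
# Toy model of llvm.abs.i8. All names here are illustrative, not LLVM APIs.
POISON = object()  # stand-in for an LLVM poison value
INT8_MIN = -128

def abs_i8(x, int_min_is_poison):
    """Model abs on a signed 8-bit value in the range [-128, 127]."""
    if x == INT8_MIN:
        # |-128| = 128 does not fit in i8: the result is either poison
        # (flag set) or wraps back to INT8_MIN (flag clear).
        return POISON if int_min_is_poison else INT8_MIN
    return -x if x < 0 else x

# With the flag set, every non-poison result is non-negative.
results = [abs_i8(x, True) for x in range(-128, 128)]
all_nonneg = all(r >= 0 for r in results if r is not POISON)

# With the flag clear, INT8_MIN is the only input with a negative result.
negatives = [x for x in range(-128, 128) if abs_i8(x, False) < 0]
```

In this model, clearing the flag only gives up the "result is non-negative" inference for the single input -128.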
Regressed with introduction of new abs intrinsic?
Yes.
int8_t transform_abs_epi8(int8_t x) {
Trunk:
define signext i8 @transform_abs_epi8(i8 signext %0) {
Clang 11:
define signext i8 @transform_abs_epi8(i8 signext %0) {
I can work on an instcombine for this. I'll see if min/max need a similar fold.
Minimally, this combine starts from abs (don't need the trunc):
define i32 @src(i8 %0) {
define i32 @tgt(i8 %0) {
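The bodies of @src and @tgt are not shown in this thread. As a sketch, assuming the narrowed form is a zext of an i8 abs with the poison flag cleared, the equivalence can be checked exhaustively in Python. The helper names are hypothetical and the code models integer semantics only.

```python
# Assumption: src = abs.i32(sext(x)), tgt = zext(abs.i8(x, flag=false)).
# The thread elides the actual function bodies.

def abs_i8_wrapping(x):
    # abs.i8 with the poison flag clear: -128 wraps to -128.
    return -128 if x == -128 else abs(x)

def zext_i8_to_i32(x):
    # Reinterpret the 8-bit pattern as unsigned; the result fits in i32.
    return x & 0xFF

def src(x):
    # sext preserves the signed value, and abs.i32 cannot overflow here
    # because a sign-extended i8 is never INT32_MIN.
    return abs(x)

def tgt(x):
    return zext_i8_to_i32(abs_i8_wrapping(x))

mismatches = [x for x in range(-128, 128) if src(x) != tgt(x)]
```

The interesting input is -128: the wrapped i8 result has bit pattern 0x80, which zero-extends to 128 and so matches the wide abs.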
https://reviews.llvm.org/rG411c144e4c99
min/max intrinsics should have a similar fold, but we'll need to account for binop variations like a constant operand vs. 2 {s/z}exts.
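A minimal sketch of why the min/max case looks similar, working on raw bit patterns in Python. The helper names are hypothetical, and the three variations checked (two sexts, two zexts, one sext plus a constant that fits in i8) are an assumed reading of the comment above, not the eventual patch.

```python
def to_signed(bits, width):
    # Interpret a width-bit pattern as a two's-complement integer.
    sign = 1 << (width - 1)
    return (bits ^ sign) - sign

def sext(bits, src=8, dst=32):
    return to_signed(bits, src) & ((1 << dst) - 1)

def zext(bits):
    return bits  # the high bits are already zero

def smax(a, b, width):
    return a if to_signed(a, width) >= to_signed(b, width) else b

def umax(a, b):
    return max(a, b)

i8_patterns = range(256)

# smax(sext a, sext b) == sext(smax(a, b))
sext_ok = all(smax(sext(a), sext(b), 32) == sext(smax(a, b, 8))
              for a in i8_patterns for b in i8_patterns)

# umax(zext a, zext b) == zext(umax(a, b))
zext_ok = all(umax(zext(a), zext(b)) == zext(umax(a, b))
              for a in i8_patterns for b in i8_patterns)

# smax(sext a, C) == sext(smax(a, trunc C)) when C fits in a signed i8
const_ok = all(smax(sext(a), c & 0xFFFFFFFF, 32) == sext(smax(a, c & 0xFF, 8))
               for a in i8_patterns for c in (-3, 10))
```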
Extended Description
Noticed while investigating the quality of vectorization for:
https://github.com/WojciechMula/toys/blob/master/autovectorization-tests/transform_abs.cpp
(see http://0x80.pl/notesen/2021-01-18-autovectorization-gcc-clang.html for the full list).
Due to promotions in the C code, for the "transform_abs_epi8" pattern we end up with this:
%20 = load <32 x i8>, <32 x i8>* %19
%21 = sext <32 x i8> %20 to <32 x i32>
%22 = call <32 x i32> @llvm.abs.v32i32(<32 x i32> %21, i1 true)
%23 = trunc <32 x i32> %22 to <32 x i8>
https://gcc.godbolt.org/z/qoYv6s
AFAICT we should be able to remove the sext/trunc and use a v32i8 abs as long as we clear the "is_int_min_poison" flag:
https://alive2.llvm.org/ce/z/Fvwhka
%23 = call <32 x i8> @llvm.abs.v32i8(<32 x i8> %20, i1 false)
What are the implications of clearing the poison flag as part of such a fold?
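The fold described above can be sanity-checked per lane with a small exhaustive model in Python. This is a sketch with hypothetical helper names, not a substitute for the Alive2 proof linked above.

```python
POISON = object()  # stand-in for an LLVM poison value

def abs_int(x, bits, int_min_is_poison):
    # Model llvm.abs on a signed integer of the given bit width.
    int_min = -(1 << (bits - 1))
    if x == int_min:
        return POISON if int_min_is_poison else int_min
    return abs(x)

def trunc_to_i8(v):
    # Keep the low 8 bits and reinterpret them as a signed i8.
    return ((v & 0xFF) ^ 0x80) - 0x80

def src(x):
    # trunc(abs.i32(sext(x), true)); sext preserves the signed value,
    # and a sign-extended i8 is never INT32_MIN, so no poison arises.
    return trunc_to_i8(abs_int(x, 32, True))

def tgt(x, int_min_is_poison):
    return abs_int(x, 8, int_min_is_poison)

lanes = range(-128, 128)
mismatches_flag_clear = [x for x in lanes if src(x) != tgt(x, False)]
poison_if_flag_kept = [x for x in lanes if tgt(x, True) is POISON]
```

In this model the source returns -128 for the input -128, which is a defined value, so keeping the flag set on the narrow abs would make the target poison where the source is not. With the flag cleared, the two agree on all 256 inputs.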