Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra bitcasts prevent simplification of bit select #33394

Closed
RKSimon opened this issue Aug 3, 2017 · 5 comments
Closed

Extra bitcasts prevent simplification of bit select #33394

RKSimon opened this issue Aug 3, 2017 · 5 comments
Labels
bugzilla Issues migrated from bugzilla llvm:codegen

Comments

@RKSimon
Copy link
Collaborator

RKSimon commented Aug 3, 2017

Bugzilla Link 34047
Resolution FIXED
Resolved on Nov 23, 2021 07:56
Version trunk
OS Windows NT
CC @rotateright
Fixed by commit(s) c36b7e2

Extended Description

Two versions of the same 'select based on comparison' pattern, one using generic vector types and the other using SSE intrinsics.

The presence of the bitcasts between SSE types prevents the recognition of the select.

#include <x86intrin.h>

__v16qi cmp_i8_sel_v16i8(__v16qi a, __v16qi b, __v16qi c, __v16qi d) {
__v16qi cc = _mm_cmpeq_epi8(a, b);
return (c & ~cc) | (d & cc);
}

__m128i cmp_i8_sel_m128i(__m128i a, __m128i b, __m128i c, __m128i d) {
__m128i cc = _mm_cmpeq_epi8(a, b);
return _mm_or_si128(_mm_andnot_si128(cc, c), _mm_and_si128(cc, d));
}

define <16 x i8> @​cmp_i8_sel_v16i8(<16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>) {
%5 = icmp eq <16 x i8> %0, %1
%6 = select <16 x i1> %5, <16 x i8> %3, <16 x i8> %2
ret <16 x i8> %6
}

define <2 x i64> @​cmp_i8_sel_m128i(<2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>) {
%5 = bitcast <2 x i64> %0 to <16 x i8>
%6 = bitcast <2 x i64> %1 to <16 x i8>
%7 = icmp eq <16 x i8> %5, %6
%8 = sext <16 x i1> %7 to <16 x i8>
%9 = bitcast <16 x i8> %8 to <2 x i64>
%10 = xor <2 x i64> %9, <i64 -1, i64 -1>
%11 = and <2 x i64> %10, %2
%12 = and <2 x i64> %9, %3
%13 = or <2 x i64> %11, %12
ret <2 x i64> %13
}

Mind you, on X86 at least it doesn't affect final codegen:

cmp_i8_sel_v16i8:
vpcmpeqb %xmm1, %xmm0, %xmm0
vpblendvb %xmm0, %xmm3, %xmm2, %xmm0
retq

cmp_i8_sel_m128i:
vpcmpeqb %xmm1, %xmm0, %xmm0
vpblendvb %xmm0, %xmm3, %xmm2, %xmm0
retq

@RKSimon
Copy link
Collaborator Author

RKSimon commented Oct 8, 2021

This is preventing abs/min/max style patterns from being detected:

https://godbolt.org/z/nbxeP7zhb

@rotateright
Copy link
Contributor

I think there's a missing or inverted canonicalization from ashr to select-of-constants for the 'abs' example, but it seems that we can see through that already with an improvement in peeking through bitcasts:
https://reviews.llvm.org/D113035

@rotateright
Copy link
Contributor

There's potentially still some remaining IR canonicalization and backend work needed to make sure these kinds of patterns are always handled, but the examples here are fixed after:
https://reviews.llvm.org/rGc36b7e21bd8f

@RKSimon
Copy link
Collaborator Author

RKSimon commented Nov 11, 2021

Thanks Sanjay!

@rotateright
Copy link
Contributor

Follow-up patch to match even more bitwise select patterns:
https://reviews.llvm.org/rG430ad9697d14

That should make the fix less fragile, but the multi-use 'TODO' item is still a concern. Ie, we could fail to match because the source is complicated by extra uses and the existing use checks are too strict.

There's also a missing canonicalization:
https://alive2.llvm.org/ce/z/iNkTZ5

Need to check which direction on that leads to less breakage...

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla llvm:codegen
Projects
None yet
Development

No branches or pull requests

2 participants