LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 34047 - Extra bitcasts prevent simplification of bit select
Summary: Extra bitcasts prevent simplification of bit select
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Common Code Generator Code (show other bugs)
Version: trunk
Hardware: PC Windows NT
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-08-03 09:09 PDT by Simon Pilgrim
Modified: 2021-11-23 07:56 PST (History)
2 users (show)

See Also:
Fixed By Commit(s): c36b7e21bd8f


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Simon Pilgrim 2017-08-03 09:09:06 PDT
Two versions of the same 'select based on comparison' pattern, one using generic vector types and the other using SSE intrinsics.

The presence of the bitcasts between SSE types prevents the recognition of the select.

#include <x86intrin.h>

__v16qi cmp_i8_sel_v16i8(__v16qi a, __v16qi b, __v16qi c, __v16qi d) {
  __v16qi cc = _mm_cmpeq_epi8(a, b);
  return (c & ~cc) | (d & cc);
}

__m128i cmp_i8_sel_m128i(__m128i a, __m128i b, __m128i c, __m128i d) {
  __m128i cc = _mm_cmpeq_epi8(a, b);
  return _mm_or_si128(_mm_andnot_si128(cc, c), _mm_and_si128(cc, d));
}

define <16 x i8> @cmp_i8_sel_v16i8(<16 x i8>, <16 x i8>, <16 x i8>, <16 x i8>) {
  %5 = icmp eq <16 x i8> %0, %1
  %6 = select <16 x i1> %5, <16 x i8> %3, <16 x i8> %2
  ret <16 x i8> %6
}

define <2 x i64> @cmp_i8_sel_m128i(<2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>) {
  %5 = bitcast <2 x i64> %0 to <16 x i8>
  %6 = bitcast <2 x i64> %1 to <16 x i8>
  %7 = icmp eq <16 x i8> %5, %6
  %8 = sext <16 x i1> %7 to <16 x i8>
  %9 = bitcast <16 x i8> %8 to <2 x i64>
  %10 = xor <2 x i64> %9, <i64 -1, i64 -1>
  %11 = and <2 x i64> %10, %2
  %12 = and <2 x i64> %9, %3
  %13 = or <2 x i64> %11, %12
  ret <2 x i64> %13
}

Mind you, on X86 at least it doesn't affect final codegen:

cmp_i8_sel_v16i8:
  vpcmpeqb %xmm1, %xmm0, %xmm0
  vpblendvb %xmm0, %xmm3, %xmm2, %xmm0
  retq

cmp_i8_sel_m128i:
  vpcmpeqb %xmm1, %xmm0, %xmm0
  vpblendvb %xmm0, %xmm3, %xmm2, %xmm0
  retq
Comment 1 Simon Pilgrim 2021-10-08 03:41:54 PDT
This is preventing abs/min/max style patterns from being detected: 

https://godbolt.org/z/nbxeP7zhb
Comment 2 Sanjay Patel 2021-11-03 05:40:10 PDT
I think there's a missing or inverted canonicalization from ashr to select-of-constants for the 'abs' example, but it seems that we can see through that already with an improvement in peeking through bitcasts:
https://reviews.llvm.org/D113035
Comment 3 Sanjay Patel 2021-11-10 10:41:13 PST
There's potentially still some remaining IR canonicalization and backend work needed to make sure these kinds of patterns are always handled, but the examples here are fixed after:
https://reviews.llvm.org/rGc36b7e21bd8f
Comment 4 Simon Pilgrim 2021-11-11 02:33:44 PST
Thanks Sanjay!
Comment 5 Sanjay Patel 2021-11-23 07:56:44 PST
Follow-up patch to match even more bitwise select patterns:
https://reviews.llvm.org/rG430ad9697d14

That should make the fix less fragile, but the multi-use 'TODO' item is still a concern. Ie, we could fail to match because the source is complicated by extra uses and the existing use checks are too strict.

There's also a missing canonicalization:
https://alive2.llvm.org/ce/z/iNkTZ5

Need to check which direction on that leads to less breakage...