LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 52092 - convert shifts and logic to popcount
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations
Version: trunk
Hardware: PC All
Importance: P enhancement
Assignee: Unassigned LLVM Bugs
Reported: 2021-10-06 09:50 PDT by Sanjay Patel
Modified: 2021-10-13 13:30 PDT

Description Sanjay Patel 2021-10-06 09:50:10 PDT
define i32 @src(i32 %0) {
  %2 = lshr i32 %0, 1
  %3 = xor i32 %2, %0
  %4 = lshr i32 %0, 2
  %5 = xor i32 %3, %4
  %6 = lshr i32 %0, 3
  %7 = xor i32 %5, %6
  %8 = and i32 %7, 1
  ret i32 %8
}

define i32 @tgt(i32 %0) {
  %2 = and i32 %0, 15
  %3 = tail call i32 @llvm.ctpop.i32(i32 %2)
  %4 = and i32 %3, 1
  ret i32 %4
}

declare i32 @llvm.ctpop.i32(i32)

https://alive2.llvm.org/ce/z/bWgS_h

This example is derived from the post-commit discussion in:
https://reviews.llvm.org/D110170

The number of masked bits is not limited to 4, and those bits don't have to be sequential. There are likely some related transforms for 'and' or 'or' rather than 'xor'.
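
To illustrate the generalization, here is a minimal C sketch (not part of the original report). It assumes a GCC/Clang-style compiler that provides __builtin_popcount. The sparse bit positions (0, 5, 9) and every function name other than src/tgt are hypothetical choices for illustration. The 'and'/'or' variants show one possible reading of the "related transforms"; they reduce naturally to a compare against the mask, which may or may not be what an eventual patch would do.

```c
#include <assert.h>
#include <stdint.h>

/* Source pattern from the report: xor bits 0..3 of x together. */
static uint32_t src(uint32_t x) {
    return (x ^ (x >> 1) ^ (x >> 2) ^ (x >> 3)) & 1u;
}

/* Target pattern: low bit (parity) of the popcount of the masked value. */
static uint32_t tgt(uint32_t x) {
    return (uint32_t)__builtin_popcount(x & 15u) & 1u;
}

/* Non-sequential bits: xor of bits 0, 5 and 9 (hypothetical positions). */
static uint32_t src_sparse(uint32_t x) {
    return (x ^ (x >> 5) ^ (x >> 9)) & 1u;
}

/* Same result via popcount; mask 0x221 has exactly bits 0, 5 and 9 set. */
static uint32_t tgt_sparse(uint32_t x) {
    return (uint32_t)__builtin_popcount(x & 0x221u) & 1u;
}

/* 'and' variant: 1 only if all four low bits are set. */
static uint32_t all_set_src(uint32_t x) {
    return (x & (x >> 1) & (x >> 2) & (x >> 3)) & 1u;
}
static uint32_t all_set_tgt(uint32_t x) {
    return (x & 15u) == 15u;
}

/* 'or' variant: 1 if any of the four low bits is set. */
static uint32_t any_set_src(uint32_t x) {
    return (x | (x >> 1) | (x >> 2) | (x >> 3)) & 1u;
}
static uint32_t any_set_tgt(uint32_t x) {
    return (x & 15u) != 0u;
}

/* Spot-check equivalence on all 16-bit inputs plus scrambled 32-bit ones. */
static void self_check(void) {
    for (uint32_t i = 0; i <= 0xFFFFu; ++i) {
        uint32_t inputs[2] = { i, i * 0x9E3779B1u };
        for (int k = 0; k < 2; ++k) {
            uint32_t x = inputs[k];
            assert(src(x) == tgt(x));
            assert(src_sparse(x) == tgt_sparse(x));
            assert(all_set_src(x) == all_set_tgt(x));
            assert(any_set_src(x) == any_set_tgt(x));
        }
    }
}
```

The xor cases work because xor-ing single bits together computes their sum modulo 2, which is the low bit of the population count of exactly those bits.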

This may not be a win in codegen for a target that doesn't support popcount, so backend work may be needed if this is implemented in IR. 
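
As a rough picture of the codegen concern, the C sketch below shows one generic way to compute the parity of a masked value without a popcount instruction: xor-folding. This is an illustrative assumption, not what any LLVM backend is known to emit, and the names parity_fold and parity_of_mask are hypothetical. For a small mask like 15, the original three-shift chain is shorter than the five-step fold, which is the kind of case where expanding ctpop naively could lose.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of all 32 bits by xor-folding halves; needs no popcount support. */
static uint32_t parity_fold(uint32_t x) {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* Generic fallback for (ctpop(x & mask) & 1): five shift/xor steps. */
static uint32_t parity_of_mask(uint32_t x, uint32_t mask) {
    return parity_fold(x & mask);
}

/* Original pattern from the report: three shift/xor steps for mask 15. */
static uint32_t src(uint32_t x) {
    return (x ^ (x >> 1) ^ (x >> 2) ^ (x >> 3)) & 1u;
}

/* Spot-check that the fallback agrees with the original pattern. */
static void self_check(void) {
    for (uint32_t i = 0; i <= 0xFFFFu; ++i) {
        uint32_t scrambled = i * 0x9E3779B1u; /* exercises the high bits */
        assert(parity_of_mask(i, 15u) == src(i));
        assert(parity_of_mask(scrambled, 15u) == src(scrambled));
    }
}
```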

Since the pattern is not limited to a fixed number of bits/ops, this could be tried in AggressiveInstCombine to avoid compile-time concerns.
Comment 1 Sanjay Patel 2021-10-06 09:53:23 PDT
Here's an x86 example of codegen:
https://godbolt.org/z/sP6n13xd3