convert shifts and logic to popcount #51434

Open
rotateright opened this issue Oct 6, 2021 · 1 comment
Labels
bugzilla Issues migrated from bugzilla

Comments

@rotateright
Contributor

Bugzilla Link 52092
Version trunk
OS All
CC @bjope,@RKSimon

Extended Description

define i32 @src(i32 %0) {
  %2 = lshr i32 %0, 1
  %3 = xor i32 %2, %0
  %4 = lshr i32 %0, 2
  %5 = xor i32 %3, %4
  %6 = lshr i32 %0, 3
  %7 = xor i32 %5, %6
  %8 = and i32 %7, 1
  ret i32 %8
}

define i32 @tgt(i32 %0) {
  %2 = and i32 %0, 15
  %3 = tail call i32 @llvm.ctpop.i32(i32 %2)
  %4 = and i32 %3, 1
  ret i32 %4
}

declare i32 @llvm.ctpop.i32(i32)

https://alive2.llvm.org/ce/z/bWgS_h
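
For readers less used to IR, the pair above can be approximated in C. This is only a sketch: the names `src` and `tgt` mirror the IR functions, and `__builtin_popcount` (a GCC/Clang builtin) stands in for `llvm.ctpop.i32`, which is an assumption about the compiler rather than part of the report.

```c
#include <stdint.h>

/* Mirrors @src: xor x with itself shifted right by 1, 2 and 3,
   then keep only bit 0. Bit 0 ends up as bit0 ^ bit1 ^ bit2 ^ bit3. */
static uint32_t src(uint32_t x) {
    return (x ^ (x >> 1) ^ (x >> 2) ^ (x >> 3)) & 1u;
}

/* Mirrors @tgt: mask the low 4 bits, count them, keep the low bit
   of the count (the parity of those 4 bits).
   Assumption: __builtin_popcount is available (GCC/Clang). */
static uint32_t tgt(uint32_t x) {
    return (uint32_t)__builtin_popcount(x & 15u) & 1u;
}
```

Only the low 4 bits of the input can reach bit 0 of the result, so checking 16 inputs already covers every distinct behaviour.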

This example is derived from the post-commit discussion in:
https://reviews.llvm.org/D110170

The number of masked bits is not limited to 4 specifically, and those bits don't have to be sequential. There are likely some related transforms for 'and' or 'or' rather than 'xor'.
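
A rough C sketch of that generalization, with an arbitrary (possibly non-contiguous) mask. All function names here are hypothetical, and `__builtin_popcount` is again assumed to be the GCC/Clang builtin. The 'and' and 'or' variants shown are one guess at what the related transforms could look like: they reduce to mask comparisons and do not need popcount at all.

```c
#include <stdint.h>

/* Reference forms: fold the bits of x selected by mask, one bit at a time. */
static uint32_t xor_bits(uint32_t x, uint32_t mask) {
    uint32_t r = 0;
    for (unsigned i = 0; i < 32; ++i)
        if ((mask >> i) & 1u)
            r ^= (x >> i) & 1u;
    return r;
}

static uint32_t and_bits(uint32_t x, uint32_t mask) {
    uint32_t r = 1;
    for (unsigned i = 0; i < 32; ++i)
        if ((mask >> i) & 1u)
            r &= (x >> i) & 1u;
    return r;
}

static uint32_t or_bits(uint32_t x, uint32_t mask) {
    uint32_t r = 0;
    for (unsigned i = 0; i < 32; ++i)
        if ((mask >> i) & 1u)
            r |= (x >> i) & 1u;
    return r;
}

/* Candidate reduced forms. */
static uint32_t xor_bits_popcount(uint32_t x, uint32_t mask) {
    /* xor of the selected bits is the parity of the masked value */
    return (uint32_t)__builtin_popcount(x & mask) & 1u;
}

static uint32_t and_bits_cmp(uint32_t x, uint32_t mask) {
    /* all selected bits set */
    return (x & mask) == mask;
}

static uint32_t or_bits_cmp(uint32_t x, uint32_t mask) {
    /* any selected bit set */
    return (x & mask) != 0;
}
```

For example, with the mask `0x8A5` (bits 0, 2, 5, 7 and 11), each reference loop and its reduced form agree on every input.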

This may not be a win in codegen for a target that doesn't support popcount, so backend work may be needed if this is implemented in IR.
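
To illustrate the codegen concern: without a popcount instruction, the parity of a masked value has to be expanded in software. The sketch below uses the well-known xor-folding expansion; it is not claimed to be what any particular LLVM backend emits, and the function names are hypothetical.

```c
#include <stdint.h>

/* Generic 32-bit parity by xor-folding: after the five folds,
   bit 0 holds the xor of all 32 input bits. */
static uint32_t parity_fallback(uint32_t x) {
    x ^= x >> 16;
    x ^= x >> 8;
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* Software equivalent of "ctpop(x & mask) & 1". */
static uint32_t masked_parity_fallback(uint32_t x, uint32_t mask) {
    return parity_fallback(x & mask);
}
```

Counted naively, this generic form is 12 operations (two 'and', five shifts, five 'xor'), while the original 4-bit pattern is 7 (three shifts, three 'xor', one 'and'). A backend would presumably want to shrink the fold based on the known mask, or restore the original sequence.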

Since the pattern is not limited to a fixed number of bits/ops, this could be tried in AggressiveInstCombine to avoid compile-time concerns.

@rotateright
Contributor Author

Here's an x86 example of codegen:
https://godbolt.org/z/sP6n13xd3

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021