50096 – __builtin_popcount(~a) & 1 - > __builtin_popcount(a) & 1

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 50096 - __builtin_popcount(~a) & 1 - > __builtin_popcount(a) & 1

Summary: __builtin_popcount(~a) & 1 - > __builtin_popcount(a) & 1

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations (show other bugs)
Version:	trunk
Hardware:	PC Linux

Importance:	P enhancement
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:

Depends on:
Blocks:

Reported:	2021-04-23 06:16 PDT by David Bolvansky
Modified:	2021-04-23 10:52 PDT (History)
CC List:	3 users (show)

See Also:
Fixed By Commit(s):	rGe10d7d455d4e

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description David Bolvansky 2021-04-23 06:16:23 PDT

----------------------------------------
define i32 @src(i32 %0) noread nowrite nofree {
%1:
  %2 = xor i32 %0, 4294967295
  %3 = ctpop i32 %2
  %4 = and i32 %3, 1
  ret i32 %4
}
=>
define i32 @tgt(i32 %0) noread nowrite nofree {
%1:
  %2 = ctpop i32 %0
  %3 = and i32 %2, 1
  ret i32 %3
}
Transformation seems to be correct!

https://godbolt.org/z/hGoxxYYd6

Comment 1 David Bolvansky 2021-04-23 06:43:37 PDT

Hm...

If we have this transformation:

----------------------------------------
define i8 @src(i8 %0) {
%1:
  %2 = xor i8 %0, 255
  %3 = ctpop i8 %2
  ret i8 %3
}
=>
define i8 @tgt(i8 %0) {
%1:
  %2 = ctpop i8 %0
  %3 = sub i8 8, %2
  ret i8 %3
}
Transformation seems to be correct!

Then

int tgt(unsigned int a)
{
    return (32 -__builtin_popcount(a))  & 1;
}


is nicely optimised to

tgt(unsigned int):                                # @tgt(unsigned int)
        popcnt  eax, edi
        and     eax, 1
        ret


@spatel, replace xor with sub, WDYT?

Comment 2 David Bolvansky 2021-04-23 06:45:54 PDT

int A(unsigned int a)
{
    return (32 -__builtin_popcount(a));
}




int B(unsigned int a)
{
    return (__builtin_popcount(~a)) ;
}

A(unsigned int):                                  # @A(unsigned int)
        popcnt  ecx, edi
        mov     eax, 32
        sub     eax, ecx
        ret
B(unsigned int):                                  # @B(unsigned int)
        not     edi
        popcnt  eax, edi
        ret

Comment 3 Sanjay Patel 2021-04-23 08:36:27 PDT

(In reply to David Bolvansky from comment #1)
> int tgt(unsigned int a)
> {
>     return (32 -__builtin_popcount(a))  & 1;
> }
> 
> 
> is nicely optimised to
> 
> tgt(unsigned int):                                # @tgt(unsigned int)
>         popcnt  eax, edi
>         and     eax, 1
>         ret
> @spatel, replace xor with sub, WDYT?

We really should have this folding to "not" in IR:
https://alive2.llvm.org/ce/z/uuo8xa

...because a 'not' is better for analysis than a 'sub'.

If that causes a regression on the parity pattern, we should add another fold for that. I'll take a look.

Comment 4 Sanjay Patel 2021-04-23 10:41:24 PDT

https://reviews.llvm.org/rGe10d7d455d4e

Please check if that solves the motivating cases.

We probably still want the sub->not transform, but if it is not showing up, it may not be that important - can either keep this open to track that or open another report.

Comment 5 David Bolvansky 2021-04-23 10:50:00 PDT

Thanks.


I will close this PR and create new one.