----------------------------------------
define i32 @src(i32 %0) noread nowrite nofree {
%1:
  %2 = xor i32 %0, 4294967295
  %3 = ctpop i32 %2
  %4 = and i32 %3, 1
  ret i32 %4
}
=>
define i32 @tgt(i32 %0) noread nowrite nofree {
%1:
  %2 = ctpop i32 %0
  %3 = and i32 %2, 1
  ret i32 %3
}
Transformation seems to be correct!

https://godbolt.org/z/hGoxxYYd6
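For readers without Alive2 at hand, the fold can be sanity-checked in plain C. The sketch below is only an illustration, not part of the report: the names parity32, count_parity_mismatches and check_parity_invariant are hypothetical, it assumes a 32-bit unsigned int and the GCC/Clang __builtin_popcount builtin, and it samples inputs rather than proving anything. The intuition is that inverting a 32-bit value turns a popcount of n into 32 - n, and since 32 is even the low bit is unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Parity of the set-bit count: the C analogue of `ctpop` followed by `and 1`. */
unsigned parity32(uint32_t x) {
    return (unsigned)__builtin_popcount(x) & 1u;
}

/* Counts the tested inputs for which parity(~x) differs from parity(x).
   A handful of edge cases plus pseudo-random samples; not an exhaustive proof. */
unsigned long count_parity_mismatches(void) {
    static const uint32_t edge[] = {0u, 1u, 0x80000000u, 0x7FFFFFFFu, 0xFFFFFFFFu};
    unsigned long bad = 0;

    for (unsigned i = 0; i < sizeof edge / sizeof edge[0]; ++i)
        if (parity32(~edge[i]) != parity32(edge[i]))
            ++bad;

    /* Simple linear congruential generator to spread samples over the range. */
    uint32_t x = 0x12345678u;
    for (uint32_t i = 0; i < (1u << 20); ++i) {
        x = x * 1664525u + 1013904223u;
        if (parity32(~x) != parity32(x))
            ++bad;
    }
    return bad;
}

/* Aborts if any tested input violates the identity. */
void check_parity_invariant(void) {
    assert(parity32(0u) == 0u && parity32(~0u) == 0u);
    assert(count_parity_mismatches() == 0);
}
```

Calling check_parity_invariant() from a test driver should complete without tripping an assertion; the Alive2 result above remains the actual proof.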
Hm... If we have this transformation:

----------------------------------------
define i8 @src(i8 %0) {
%1:
  %2 = xor i8 %0, 255
  %3 = ctpop i8 %2
  ret i8 %3
}
=>
define i8 @tgt(i8 %0) {
%1:
  %2 = ctpop i8 %0
  %3 = sub i8 8, %2
  ret i8 %3
}
Transformation seems to be correct!

Then

int tgt(unsigned int a)
{
    return (32 - __builtin_popcount(a)) & 1;
}

is nicely optimised to

tgt(unsigned int):                      # @tgt(unsigned int)
        popcnt  eax, edi
        and     eax, 1
        ret

@spatel, replace xor with sub, WDYT?
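The i8 case is small enough to check exhaustively in C. This is a hedged sketch rather than anything from the patch: count_i8_mismatches, parity_via_sub, parity_direct and check_sub_forms are hypothetical names, and the 32-bit helpers assume unsigned int is 32 bits wide. It compares ctpop(~x) against 8 - ctpop(x) for all 256 inputs, and spot-checks that the sub form has the same low bit as a plain popcount.

```c
#include <assert.h>
#include <stdint.h>

/* Counts 8-bit inputs where ctpop(~x) differs from 8 - ctpop(x). */
unsigned count_i8_mismatches(void) {
    unsigned bad = 0;
    for (unsigned v = 0; v < 256; ++v) {
        uint8_t x = (uint8_t)v;
        uint8_t inverted = (uint8_t)~x;           /* mirrors: xor i8 %0, 255 */
        int src = __builtin_popcount(inverted);   /* mirrors: ctpop of the xor */
        int tgt = 8 - __builtin_popcount(x);      /* mirrors: sub i8 8, ctpop */
        if (src != tgt)
            ++bad;
    }
    return bad;
}

/* The sub form from the comment above, assuming a 32-bit unsigned int. */
int parity_via_sub(unsigned a) { return (32 - __builtin_popcount(a)) & 1; }

/* The form the optimised assembly computes: popcnt followed by and 1. */
int parity_direct(unsigned a) { return __builtin_popcount(a) & 1; }

/* Aborts if the exhaustive i8 check or any 32-bit spot check fails. */
void check_sub_forms(void) {
    static const unsigned samples[] = {0u, 1u, 7u, 0x80000000u, 0xFFFFFFFFu};
    assert(count_i8_mismatches() == 0);
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(parity_via_sub(samples[i]) == parity_direct(samples[i]));
}
```

The 32-bit part is only a spot check on a few values; it is the even bit width that makes the subtraction disappear under the and 1.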
int A(unsigned int a)
{
    return (32 - __builtin_popcount(a));
}

int B(unsigned int a)
{
    return (__builtin_popcount(~a));
}

A(unsigned int):                        # @A(unsigned int)
        popcnt  ecx, edi
        mov     eax, 32
        sub     eax, ecx
        ret
B(unsigned int):                        # @B(unsigned int)
        not     edi
        popcnt  eax, edi
        ret
(In reply to David Bolvansky from comment #1)
> int tgt(unsigned int a)
> {
>     return (32 -__builtin_popcount(a)) & 1;
> }
>
>
> is nicely optimised to
>
> tgt(unsigned int): # @tgt(unsigned int)
>         popcnt eax, edi
>         and eax, 1
>         ret
> @spatel, replace xor with sub, WDYT?

We really should have this folding to "not" in IR:
https://alive2.llvm.org/ce/z/uuo8xa
...because a 'not' is better for analysis than a 'sub'.

If that causes a regression on the parity pattern, we should add another fold for that. I'll take a look.
https://reviews.llvm.org/rGe10d7d455d4e Please check if that solves the motivating cases. We probably still want the sub->not transform, but if it is not showing up, it may not be that important - can either keep this open to track that or open another report.
Thanks. I will close this PR and create a new one.