Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed optimization for A && !B #49733

Open
llvmbot opened this issue May 18, 2021 · 3 comments
Open

Missed optimization for A && !B #49733

llvmbot opened this issue May 18, 2021 · 3 comments
Labels
bugzilla Issues migrated from bugzilla llvm:codegen

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented May 18, 2021

Bugzilla Link 50389
Version trunk
OS All
Reporter LLVM Bugzilla Contributor
CC @davidbolvansky,@DougGregor,@LebedevRI,@RKSimon,@zygoloid,@rotateright

Extended Description

The ABI dictates that bool be strictly 0 or 1. A bool value outside of 0 or 1 is UB

The compiler can optimize A && !B1 then as A > B`, which say under x86 becomes the macro fused op:

AAndNotBAlt(bool, bool):
cmp dil, sil
jbe .LBB1_1

instead, the compiler generates:
https://gcc.godbolt.org/z/EWx63b4nz

AAndNotB(bool, bool):
test edi, edi
je .LBB0_2
test sil, sil
jne .LBB0_2
jmp True()

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jun 11, 2021

@​spatel Can we do this opt as below in the instCombine pass using the match patterns:

https://alive2.llvm.org/ce/z/dytU8_

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jun 11, 2021

But, I see that instcombine itself converts icmp to XOR and AND insts. Not sure if instcombine is the correct place for this opt:

https://godbolt.org/z/xaPdeK61d

@rotateright
Copy link
Contributor

As noted, InstCombine currently transforms icmp of bools to logic ops.

I don't know the history of that decision, but it seems like a reasonable choice for IR because it is more natural to analyze boolean logic rather than comparisons of bools. This is despite the fact that the logic sequence may be 2 instructions instead of 1 compare.

If this is better inverted in x86 asm, then you have backend options:

  1. Show that the icmp form is better for all targets and add a transform to DAGCombiner.
  2. Show that the icmp form is better for >1 targets and add a transform using a TLI hook in DAGCombiner.
  3. Show that the transform is always good for x86 and add a target-specific transform in X86ISelLowering.

There may be some complications here to deal with branches on merged conditions (see the TLI hook isJumpExpensive()).

The godbolt link in the description shows a different possible missing InstCombine transform:
https://alive2.llvm.org/ce/z/gsT3bD

...or that might be 2 different transforms. Be aware of poison/undef behavior to make sure that is preserved.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla llvm:codegen
Projects
None yet
Development

No branches or pull requests

2 participants