LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 50389 - Missed optimization for A && !B
Summary: Missed optimization for A && !B
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Common Code Generator Code (show other bugs)
Version: trunk
Hardware: All All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-05-18 09:43 PDT by Martijn Vels
Modified: 2021-06-18 08:32 PDT (History)
10 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Martijn Vels 2021-05-18 09:43:03 PDT
The ABI dictates that bool be strictly 0 or 1. A bool value outside of 0 or 1 is UB

The compiler can optimize `A && !B1 then as `A > B`, which say under x86 becomes the macro fused op:

AAndNotBAlt(bool, bool):
        cmp     dil, sil
        jbe     .LBB1_1

instead, the compiler generates:
https://gcc.godbolt.org/z/EWx63b4nz

AAndNotB(bool, bool):
        test    edi, edi
        je      .LBB0_2
        test    sil, sil
        jne     .LBB0_2
        jmp     True()
Comment 1 Datta Nagraj Pathikonda 2021-06-11 08:55:59 PDT
@spatel Can we do this opt as below in the instCombine pass using the match patterns:

https://alive2.llvm.org/ce/z/dytU8_
Comment 2 Datta Nagraj Pathikonda 2021-06-11 09:08:13 PDT
But, I see that instcombine itself converts icmp to XOR and AND insts. Not sure if instcombine is the correct place for this opt:

https://godbolt.org/z/xaPdeK61d
Comment 3 Sanjay Patel 2021-06-13 06:28:41 PDT
As noted, InstCombine currently transforms icmp of bools to logic ops. 

I don't know the history of that decision, but it seems like a reasonable choice for IR because it is more natural to analyze boolean logic rather than comparisons of bools. This is despite the fact that the logic sequence may be 2 instructions instead of 1 compare.

If this is better inverted in x86 asm, then you have backend options:
1. Show that the icmp form is better for all targets and add a transform to DAGCombiner.
2. Show that the icmp form is better for >1 targets and add a transform using a TLI hook in DAGCombiner.
3. Show that the transform is always good for x86 and add a target-specific transform in X86ISelLowering.

There may be some complications here to deal with branches on merged conditions (see the TLI hook isJumpExpensive()).

The godbolt link in the description shows a different possible missing InstCombine transform:
https://alive2.llvm.org/ce/z/gsT3bD

...or that might be 2 different transforms. Be aware of poison/undef behavior to make sure that is preserved.