I found this trying to make a better implementation of Rust's `Ord::cmp` for integers.

C++ repro: https://godbolt.org/z/tL0-oW

```
int spaceship(int a, int b) {
    return (a > b) - (a < b);
}
bool my_lt(int a, int b) {
    return spaceship(a, b) == -1;
}
```

`my_lt` there should be foldable down to just `a < b`, but that doesn't happen:

```
define dso_local zeroext i1 @_Z5my_ltii(i32 %0, i32 %1) local_unnamed_addr #0 {
  %3 = icmp sgt i32 %0, %1
  %4 = zext i1 %3 to i32
  %5 = icmp slt i32 %0, %1
  %6 = zext i1 %5 to i32
  %7 = sub nsw i32 %4, %6
  %8 = icmp eq i32 %7, -1
  ret i1 %8
}
```

(And running it through `opt` again doesn't help either: https://godbolt.org/z/-hl5H0)
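For anyone who wants to sanity-check the claimed fold outside the compiler, here is a minimal sketch that compares `my_lt` against plain `a < b` on a handful of values. The helper `my_lt_matches_less_than` and the sample set are my own assumptions for illustration, not part of the report, and a spot check like this is not a proof of the fold.

```cpp
#include <cassert>
#include <climits>

// Same definitions as in the repro.
int spaceship(int a, int b) { return (a > b) - (a < b); }
bool my_lt(int a, int b) { return spaceship(a, b) == -1; }

// Hypothetical helper (not from the report): checks my_lt against `<`
// on a small grid of values that includes the extremes of int.
bool my_lt_matches_less_than() {
    const int samples[] = {INT_MIN, INT_MIN + 1, -2, -1, 0,
                           1, 2, INT_MAX - 1, INT_MAX};
    for (int a : samples) {
        for (int b : samples) {
            if (my_lt(a, b) != (a < b)) {
                return false;  // a mismatch would mean the fold is invalid
            }
        }
    }
    return true;
}
```

Calling `my_lt_matches_less_than()` from a test or from `main` should return `true`. The reasoning: `spaceship` only produces -1, 0 or 1, and it produces -1 exactly when `a < b` holds.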
If we add this canonicalization to use 'select', existing transforms should find the simplification:

  %gt = icmp sgt i32 %x, %y
  %zgt = zext i1 %gt to i32
  %lt = icmp slt i32 %x, %y
  %zlt = zext i1 %lt to i32
  %d = sub nsw i32 %zgt, %zlt
=>
  %d = select i1 %lt, i32 -1, i32 %zgt

https://rise4fun.com/Alive/pQN
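The equivalence behind that rewrite can be mirrored in C++ as a rough sketch. The function names `diff_via_sub`, `diff_via_select` and `forms_agree` are hypothetical, chosen only for this illustration. The Alive link above is the actual proof. This only spot-checks a few inputs.

```cpp
#include <cassert>
#include <climits>

// Mirrors the original IR: subtract the two zero-extended compares.
int diff_via_sub(int x, int y) {
    int zgt = x > y;  // zext i1 %gt to i32
    int zlt = x < y;  // zext i1 %lt to i32
    return zgt - zlt; // sub nsw i32 %zgt, %zlt
}

// Mirrors the proposed form: select i1 %lt, i32 -1, i32 %zgt.
int diff_via_select(int x, int y) {
    int zgt = x > y;
    return (x < y) ? -1 : zgt;
}

// Hypothetical helper: compares both forms on a small grid of values.
bool forms_agree() {
    const int samples[] = {INT_MIN, -1, 0, 1, INT_MAX};
    for (int x : samples) {
        for (int y : samples) {
            if (diff_via_sub(x, y) != diff_via_select(x, y)) {
                return false;
            }
        }
    }
    return true;
}
```

The two forms agree because `%lt` and `%gt` cannot both be true: when `%lt` is set the subtraction is 0 - 1 = -1, and otherwise it is `%zgt` - 0 = `%zgt`.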
Note that that canonicalization would undo what I was trying in the first place, which is this difference: https://godbolt.org/z/to7D8q

```
example::spaceship1:
        cmp     edi, esi
        seta    al
        sbb     al, 0
        ret

example::spaceship2:
        xor     ecx, ecx
        cmp     edi, esi
        seta    cl
        mov     eax, 255
        cmovae  eax, ecx
        ret
```

It's not obvious to me which of those is better in general (is byte sbb or cmov worse?), but llvm-mca does say the former is better on the (old) core2 CPU I'm currently running.