LLVM should better understand subtraction of zext i1s #42543

Open
llvmbot opened this issue Sep 2, 2019 · 2 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@llvmbot
Collaborator

llvmbot commented Sep 2, 2019

Bugzilla Link 43198
Version trunk
OS Windows NT
Reporter LLVM Bugzilla Contributor
CC @RKSimon,@rotateright

Extended Description

I found this trying to make a better implementation of Rust's Ord::cmp for integers.

C++ repro https://godbolt.org/z/tL0-oW

int spaceship(int a, int b) {
    return (a > b) - (a < b);
}

bool my_lt(int a, int b) {
    return spaceship(a, b) == -1;
}

my_lt should fold down to just a < b, but that doesn't happen:

define dso_local zeroext i1 @_Z5my_ltii(i32 %0, i32 %1) local_unnamed_addr #0 {
  %3 = icmp sgt i32 %0, %1
  %4 = zext i1 %3 to i32
  %5 = icmp slt i32 %0, %1
  %6 = zext i1 %5 to i32
  %7 = sub nsw i32 %4, %6
  %8 = icmp eq i32 %7, -1
  ret i1 %8
}

(And running it through opt again doesn't help either, https://godbolt.org/z/-hl5H0)
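
To make the expected fold concrete, here is a minimal sketch that checks my_lt against a plain a < b on a handful of boundary values. The helper name check_my_lt and the sample set are illustrative assumptions, not part of the original repro.

```cpp
#include <cassert>
#include <climits>

// Three-way comparison from the report: yields -1, 0, or 1.
int spaceship(int a, int b) {
    return (a > b) - (a < b);
}

// The function the report expects to fold to `a < b`.
bool my_lt(int a, int b) {
    return spaceship(a, b) == -1;
}

// Hypothetical helper (not in the report): compares my_lt with `a < b`
// over all pairs drawn from a small set of boundary values.
void check_my_lt() {
    const int samples[] = {INT_MIN, INT_MIN + 1, -2, -1, 0,
                           1, 2, INT_MAX - 1, INT_MAX};
    for (int a : samples) {
        for (int b : samples) {
            assert(my_lt(a, b) == (a < b));
        }
    }
}
```

Because each comparison yields only 0 or 1, the difference is -1 exactly when a < b is true and a > b is false, which is the equivalence the optimizer is being asked to discover.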

@rotateright
Contributor

If we add this canonicalization to use 'select', existing transforms should find the simplification:

%gt = icmp sgt i32 %x, %y
%zgt = zext i1 %gt to i32
%lt = icmp slt i32 %x, %y
%zlt = zext i1 %lt to i32
%d = sub nsw i32 %zgt, %zlt
=>
%d = select i1 %lt, i32 -1, i32 %zgt

https://rise4fun.com/Alive/pQN
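
A rough C++ rendering of the two IR forms above may help show why they agree. The function names are hypothetical, and the sketch only samples a few values rather than proving the equivalence the way the Alive link does.

```cpp
#include <cassert>
#include <climits>

// Original form: difference of the two zero-extended comparison bits.
int sub_of_zexts(int x, int y) {
    int zgt = (x > y) ? 1 : 0;  // zext i1 %gt to i32
    int zlt = (x < y) ? 1 : 0;  // zext i1 %lt to i32
    return zgt - zlt;           // sub nsw i32 %zgt, %zlt
}

// Proposed form, mirroring `select i1 %lt, i32 -1, i32 %zgt`.
int select_form(int x, int y) {
    bool lt = x < y;
    int zgt = (x > y) ? 1 : 0;
    return lt ? -1 : zgt;
}

// Hypothetical helper: checks that both forms agree on boundary values.
void check_select_form() {
    const int samples[] = {INT_MIN, -1, 0, 1, INT_MAX};
    for (int x : samples) {
        for (int y : samples) {
            assert(sub_of_zexts(x, y) == select_form(x, y));
        }
    }
}
```

The rewrite depends on %lt and %gt never both being true: when %lt holds, %zgt is 0, so the difference is -1. In the select form, the false arm %zgt can only be 0 or 1, so a later comparison against -1 can succeed only through the true arm, which presumably is what lets existing transforms reduce the whole expression to %lt.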

@llvmbot
Collaborator Author

llvmbot commented Sep 4, 2019

Note that that canonicalization would undo what I was trying to do in the first place, which is this difference: https://godbolt.org/z/to7D8q

example::spaceship1:
        cmp     edi, esi
        seta    al
        sbb     al, 0
        ret

example::spaceship2:
        xor     ecx, ecx
        cmp     edi, esi
        seta    cl
        mov     eax, 255
        cmovae  eax, ecx
        ret

It's not obvious to me which of those is better in general (is byte sbb or cmov worse?), but llvm-mca does say the former is better on the (old) core2 CPU I'm currently running.
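
The source behind the two listings is not shown in the thread, so the following is only a guess at the two formulations, written in C++ and inferred from the assembly (an unsigned compare via seta, a byte-sized result, and 255 standing for -1). The names spaceship_sub and spaceship_select are assumptions. The sketch only demonstrates that the two forms return the same values; it says nothing about which one is faster.

```cpp
#include <cassert>
#include <cstdint>

// Subtraction form: likely the shape that lowers to cmp / seta / sbb.
std::int8_t spaceship_sub(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::int8_t>((a > b) - (a < b));
}

// Select form: likely the shape that lowers to the cmov sequence.
std::int8_t spaceship_select(std::uint32_t a, std::uint32_t b) {
    return (a < b) ? std::int8_t{-1} : static_cast<std::int8_t>(a > b);
}

// Hypothetical helper: checks that both forms agree on boundary values.
void check_spaceship_forms() {
    const std::uint32_t samples[] = {0u, 1u, 2u, 0x7fffffffu,
                                     0x80000000u, 0xffffffffu};
    for (std::uint32_t a : samples) {
        for (std::uint32_t b : samples) {
            assert(spaceship_sub(a, b) == spaceship_select(a, b));
        }
    }
}
```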

@llvmbot transferred this issue from llvm/llvm-bugzilla-archive on Dec 10, 2021