clang asm chooses poorly for "=rm" #46874

llvmbot · 2020-09-14T21:55:11Z


Bugzilla Link	47530
Resolution	DUPLICATE
Resolved on	Sep 15, 2020 16:20
Version	trunk
OS	Linux
Blocks	#4440
Reporter	LLVM Bugzilla Contributor
CC	@topperc,@echristo,@efriedma-quic,@isanbard,@jyknight,@nickdesaulniers,@zygoloid

Extended Description

With this trivial input:

unsigned long mov_zero(void)
{
    unsigned long ret;
    asm ("movq $0, %0" : "=rm" (ret));
    return ret;
}

clang -O2 generates this very suboptimal output:

mov_zero:
    movq $0x0,-0x8(%rsp)
    mov -0x8(%rsp),%rax
    ret

gcc does much better:

mov_zero:
    mov $0x0,%rax
    retq

topperc · 2020-09-14T22:13:20Z

I think we just blindly give priority to the memory constraint over the register constraint.
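
A minimal sketch of what that implies for the function in the report, assuming x86-64 and GNU-style inline asm: if "m" is dropped from the constraint, only the register alternative remains, so there is no memory operand to prefer. The name mov_zero_reg and the non-x86-64 fallback are hypothetical, added only so the sketch compiles elsewhere.

```c
/* Hypothetical variant of mov_zero from the report. With "=r" alone the
   memory alternative is gone, so the output operand must be a register. */
unsigned long mov_zero_reg(void)
{
#if defined(__x86_64__)
    unsigned long ret;
    __asm__ ("movq $0, %0" : "=r" (ret));
    return ret;
#else
    /* Fallback for other targets; not part of the original report. */
    return 0;
#endif
}
```

This only sidesteps the problem for code that can afford to give up the memory alternative; it does not change how "=rm" itself is lowered.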

isanbard · 2020-09-15T22:07:13Z

The major issue with supporting multiple constraints is how we model those constraints until register allocation is complete. (Thank you, Capt. Obvious!) The decision of which constraint to use is made during DAG selection. So there's no (easy) way to change this during register allocation.

Half-baked idea (all very hand-wavy):

Different constraints use different set ups (pre-asm code) and tear downs (post-asm code) for the inline asm. What if we created pseudo-instructions to represent inline asm set up and tear down? Something like this:

INLINEASM_SETUP <representing register/memory setup>
INLINEASM <...>
INLINEASM_TEARDOWN <representing copy of results into vregs/pregs>

The register allocator could then try different constraints (going from most restrictive to least restrictive) until it finds one that works.

One drawback is that the RA needs to process INLINEASM before it can generate the correct code for INLINEASM_SETUP. That might be doable if the three instructions are treated as a single unit.

efriedma-quic · 2020-09-15T22:13:26Z

*** This bug has been marked as a duplicate of bug #20571 ***

efriedma-quic · 2020-09-15T23:15:39Z

> INLINEASM_SETUP <representing register/memory setup>
> INLINEASM <...>
> INLINEASM_TEARDOWN <representing copy of results into vregs/pregs>

If we just care about "rm", specifically, there's a much simpler solution: we emit the operand as a register, but add a flag to indicate it can be rewritten to a memory operand.

isanbard · 2020-09-15T23:20:46Z

It might be fine to have that as the first step, but I think there are other unoptimized uses to consider, e.g.:

int test(int n, int idx) {
    asm("add %1, %0"
        : "=ro"(idx)
        : "ro"(n)
        : "cc");
    return idx;
}

Though this involves teaching the reg alloc. about addressing modes.
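
For comparison, a register-only version of the same operation might look like the sketch below, assuming x86-64 and AT&T operand order. The name test_reg is hypothetical, and "+r" is used instead of "=r" because add both reads and writes its destination; this is a change from the snippet above, not a restatement of it.

```c
/* Hypothetical register-only counterpart of test(): both operands are
   restricted to registers, so no addressing-mode choice is involved. */
int test_reg(int n, int idx)
{
#if defined(__x86_64__)
    __asm__ ("add %1, %0"
             : "+r"(idx)   /* read-write: add uses the old value of idx */
             : "r"(n)
             : "cc");
    return idx;
#else
    /* Fallback for other targets so the sketch still compiles. */
    return idx + n;
#endif
}
```

With "ro" the compiler would additionally have to decide, per operand, between a register and an offsettable memory reference, which is the part that needs addressing-mode knowledge.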

nickdesaulniers · 2021-11-27T03:38:08Z

mentioned in issue llvm/llvm-bugzilla-archive#47531

nickdesaulniers · 2021-12-16T19:40:26Z

this was duplicated to: #20571

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

jyknight mentioned this issue Mar 3, 2022

inline asm "rm" constraint lowered "m" when "r" would be preferable #20571

Open

This issue was closed.
