New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clang asm chooses poorly for "=rm" #46874
Comments
I think we just blindly give priority to the memory contraint over the register constraint. |
The major issue with supporting multiple constraints is how we model those constraints until register allocation is complete. (Thank you, Capt. Obvious!) The decision of which constraint to use is made during DAG selection. So there's no (easy) way to change this during register allocation. Half-baked idea (all very hand-wavy): Different constraints use different set ups (pre-asm code) and tear downs (post-asm code) for the inline asm. What if we created pseudo-instructions to represent inline asm set up and tear down? Something like this: INLINEASM_SETUP <representing register/memory setup> The register allocator could then try different constraints (going from most restrictive to least restrictive) until it find one that works. One drawback is that the RA needs to process INLINEASM before it can generate the correct code for INLINEASM_SETUP. That might be doable if the three instructions are treated as a single unit. |
*** This bug has been marked as a duplicate of bug #20571 *** |
If we just care about "rm", specifically, there's a much simpler solution: we emit the operand as a register, but add flag to indicate it can be rewritten to a memory operand. |
It might be fine to have that as the first step, but I think there are other unoptimized uses to consider, e.g.: int test(int n, int idx) { Though this involves teaching the reg alloc. about addressing modes. |
mentioned in issue llvm/llvm-bugzilla-archive#47531 |
this was duplicated to: #20571 |
Extended Description
With this trivial input:
unsigned long mov_zero(void)
{
unsigned long ret;
asm ("movq $0, %0" : "=rm" (ret));
return ret;
}
clang -O2 generates this very suboptimal output:
mov_zero:
movq $0x0,-0x8(%rsp)
mov -0x8(%rsp),%rax
ret
gcc does much better:
mov_zero:
mov $0x0,%rax
retq
The text was updated successfully, but these errors were encountered: