When multiple alternatives in an inline asm constraint are given we ignore all of them but the most "general". This gives nasty artifacts in the code.

int bsr(unsigned v) {
  int ret;
  __asm__("bsr %1, %0" : "=&r"(ret) : "rm"(v) : "cc");
  return ret;
}

$ clang -O3 -S -o - t.c
bsr:
        movl    %edi, -4(%rsp)
        #APP
        bsrl    -4(%rsp), %eax
        #NO_APP
        retq

The spilling is totally unnecessary. GCC gets this one right. On 32 bit x86 it's even worse:

$ clang -O3 -S -o - t.c -m32
bsr:
        pushl   %eax
        movl    8(%esp), %eax
        movl    %eax, (%esp)
        #APP
        bsrl    (%esp), %eax
        #NO_APP
        popl    %edx
        retl

GCC knows a better way:

$ gcc-4.8 -O3 -S -o - t.c -m32
bsr:
        #APP
        bsr     4(%esp), %eax
        #NO_APP
        ret

The constraint "g" is just as bad, being translated into "imr" internally.
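For anyone trying to reproduce or sidestep this, here is a minimal sketch that puts the reported "rm" form next to a single-alternative "r" form. The "r"-only variant is an assumption on my side, not something proposed in the report: it should avoid the spill because there is no memory alternative left to pick, but it also gives up the direct memory operand that GCC manages to use on 32-bit. The names bsr_rm and bsr_r are made up for illustration, and the non-x86 branch exists only so the file builds elsewhere.

```c
#include <assert.h>

#if defined(__i386__) || defined(__x86_64__)
/* Form from the report: with "rm", clang was observed to spill v to the stack. */
int bsr_rm(unsigned v) {
  int ret;
  __asm__("bsr %1, %0" : "=&r"(ret) : "rm"(v) : "cc");
  return ret;
}

/* Hypothetical workaround: only "r", so the operand has to be a register. */
int bsr_r(unsigned v) {
  int ret;
  __asm__("bsr %1, %0" : "=&r"(ret) : "r"(v) : "cc");
  return ret;
}
#else
/* Non-x86 fallback (assumes 32-bit unsigned); not part of the report. */
int bsr_rm(unsigned v) { return 31 - __builtin_clz(v); }
int bsr_r(unsigned v) { return 31 - __builtin_clz(v); }
#endif
```

Both functions return the index of the highest set bit and are undefined for v == 0. Comparing the -O3 -S output of the two is a quick way to see whether a given compiler still spills.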
Agreed.
*** Bug 35489 has been marked as a duplicate of this bug. ***
Wouldn't it be better to at least treat 'mr' as 'r' rather than 'm'? This will probably yield better code in many cases.
*** Bug 37583 has been marked as a duplicate of this bug. ***
*** Bug 31525 has been marked as a duplicate of this bug. ***
(In reply to Ruslan Nikolaev from comment #3)
> Wouldn't it be better to at least treat 'mr' as 'r' rather than 'm'? This
> will probably yield better code in many cases.

"mr" isn't being treated as "m". Consider this C function:

void f(int x) {
  asm volatile("# x is in %0" :: "mr"(x));
}

With "clang -O3 -m32", it compiles into this mess:

f:
        pushl   %eax
        movl    8(%esp), %eax
        movl    %eax, (%esp)
        # x is in (%esp)
        popl    %eax
        retl

But if I use "m" instead of "mr", then it compiles into what I wanted:

f:
        # x is in 4(%esp)
        retl

So the presence of "r" is somehow making the codegen worse even though it's putting the value in memory anyway.
*** Bug 47530 has been marked as a duplicate of this bug. ***
[Copy-n-paste of my harebrained idea here]

The major issue with supporting multiple constraints is how we model those constraints until register allocation is complete. (Thank you, Capt. Obvious!) The decision of which constraint to use is made during DAG selection. So there's no (easy) way to change this during register allocation.

Half-baked idea (all very hand-wavy):

Different constraints use different setups (pre-asm code) and teardowns (post-asm code) for the inline asm. What if we created pseudo-instructions to represent inline asm setup and teardown? Something like this:

  INLINEASM_SETUP <representing register/memory setup>
  INLINEASM <...>
  INLINEASM_TEARDOWN <representing copy of results into vregs/pregs>

The register allocator could then try different constraints (going from most restrictive to least restrictive) until it finds one that works. One drawback is that the RA needs to process INLINEASM before it can generate the correct code for INLINEASM_SETUP. That might be doable if the three instructions are treated as a single unit.
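To make the "most restrictive to least restrictive" part a bit more concrete, here is a toy model in C. It is not LLVM code. Every name in it (enum alt, struct operand_state, pick_alternative) is hypothetical, and the ordering immediate, then register, then memory is my assumption about what "most restrictive first" would mean for "i", "r" and "m". The "g" handling follows the earlier comment that "g" is treated as "imr".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operand kinds that a constraint string can allow. */
enum alt { ALT_IMM, ALT_REG, ALT_MEM, ALT_NONE };

/* Hypothetical view of what the allocator knows at the asm statement. */
struct operand_state {
  int is_constant;    /* operand value is a compile-time constant */
  int free_registers; /* registers still available at this point */
};

/* True if the constraint string permits `letter`; "g" stands for "imr". */
static int allows(const char *constraint, char letter) {
  return strchr(constraint, letter) != NULL || strchr(constraint, 'g') != NULL;
}

/* Try the alternatives from most to least restrictive and return the
   first one that can be satisfied in the current state. */
enum alt pick_alternative(const char *constraint,
                          const struct operand_state *st) {
  if (allows(constraint, 'i') && st->is_constant)
    return ALT_IMM;
  if (allows(constraint, 'r') && st->free_registers > 0)
    return ALT_REG;
  if (allows(constraint, 'm'))
    return ALT_MEM; /* a stack slot can always be created */
  return ALT_NONE;
}
```

Under this model "rm" with a free register resolves to the register alternative and only falls back to memory under register pressure. The hard part described above is untouched by the sketch: the setup and teardown code would still have to be generated after this choice is made.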
*** Bug 49406 has been marked as a duplicate of this bug. ***