20197 – Inline asm constraint alternatives ignored

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 20197 - Inline asm constraint alternatives ignored

Summary: Inline asm constraint alternatives ignored

Status:	NEW

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Common Code Generator Code (show other bugs)
Version:	trunk
Hardware:	PC Linux

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:

Duplicates (5):	31525 35489 37583 47530 49406 (view as bug list)
Depends on:
Blocks:	4068
	Show dependency tree

Reported:	2014-07-03 09:40 PDT by Benjamin Kramer
Modified:	2021-08-16 16:16 PDT (History)
CC List:	11 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Benjamin Kramer 2014-07-03 09:40:05 PDT

When multiple alternatives in an inline asm constraint are given we ignore all of them but the most "general". This gives nasty artifacts in the code.

int bsr(unsigned v) {
  int ret;
  __asm__("bsr %1, %0" : "=&r"(ret) : "rm"(v) : "cc");
  return ret;
}

$ clang -O3 -S -o - t.c
bsr:
	movl	%edi, -4(%rsp)
	#APP
	bsrl	-4(%rsp), %eax
	#NO_APP
	retq

The spilling is totally unnecessary. GCC gets this one right. On 32 bit x86 it's even worse:

$ clang -O3 -S -o - t.c -m32
bsr:
	pushl	%eax
	movl	8(%esp), %eax
	movl	%eax, (%esp)
	#APP
	bsrl	(%esp), %eax
	#NO_APP
	popl	%edx
	retl

GCC knows a better way:
$ gcc-4.8 -O3 -S -o - t.c -m32
bsr:
#APP
	bsr 4(%esp), %eax
#NO_APP
	ret

The constraint "g" is just as bad, being translated into "imr" internally.

Comment 1 Eric Christopher 2014-07-03 12:36:10 PDT

Agreed.

Comment 2 Nirav Dave 2018-03-20 08:52:24 PDT

*** Bug 35489 has been marked as a duplicate of this bug. ***

Comment 3 Ruslan Nikolaev 2018-03-20 10:41:58 PDT

Would not it be better to at least treat 'mr' as 'r' rather than 'm'? This will probably yield better code in many cases.

Comment 4 Nirav Dave 2018-06-01 07:33:02 PDT

*** Bug 37583 has been marked as a duplicate of this bug. ***

Comment 5 Nirav Dave 2018-06-01 07:40:11 PDT

*** Bug 31525 has been marked as a duplicate of this bug. ***

Comment 6 Joseph C. Sible 2020-06-06 11:08:35 PDT

(In reply to Ruslan Nikolaev from comment #3)
> Would not it be better to at least treat 'mr' as 'r' rather than 'm'? This
> will probably yield better code in many cases.

"mr" isn't being treated as "m". Consider this C function:

void f(int x) {
    asm volatile("# x is in %0" :: "mr"(x));
}

With "clang -O3 -m32", it compiles into this mess:

f:
        pushl   %eax
        movl    8(%esp), %eax
        movl    %eax, (%esp)
        # x is in (%esp)
        popl    %eax
        retl

But if I use "m" instead of "mr", then it compiles into what I wanted:

f:
        # x is in 4(%esp)
        retl

So the presence of "r" is somehow making the codegen worse even though it's putting the value in memory anyway.

Comment 7 Eli Friedman 2020-09-15 15:13:26 PDT

*** Bug 47530 has been marked as a duplicate of this bug. ***

Comment 8 Bill Wendling 2020-09-15 15:41:41 PDT

[Copy-n-paste of my harebrained idea here]

The major issue with supporting multiple constraints is how we model those constraints until register allocation is complete. (Thank you, Capt. Obvious!) The decision of which constraint to use is made during DAG selection. So there's no (easy) way to change this during register allocation.

Half-baked idea (all very hand-wavy):

Different constraints use different set ups (pre-asm code) and tear downs (post-asm code) for the inline asm. What if we created pseudo-instructions to represent inline asm set up and tear down? Something like this:

  INLINEASM_SETUP <representing register/memory setup>
  INLINEASM <...>
  INLINEASM_TEARDOWN <representing copy of results into vregs/pregs>

The register allocator could then try different constraints (going from most restrictive to least restrictive) until it find one that works.

One drawback is that the RA needs to process INLINEASM before it can generate the correct code for INLINEASM_SETUP. That might be doable if the three instructions are treated as a single unit.

Comment 9 Eli Friedman 2021-03-03 15:02:16 PST

*** Bug 49406 has been marked as a duplicate of this bug. ***