LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 19778 - asm label with the same name as static variable leads to: Assertion failed: (Symbol->isUndefined() && "Cannot define a symbol twice!")
Summary: asm label with the same name as static variable leads to: Assertion failed: (...
Status: NEW
Alias: None
Product: new-bugs
Classification: Unclassified
Component: new bugs (show other bugs)
Version: trunk
Hardware: All All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
: 23541 27932 29169 (view as bug list)
Depends on:
Blocks:
 
Reported: 2014-05-18 06:18 PDT by Dimitry Andric
Modified: 2016-08-29 06:49 PDT (History)
8 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dimitry Andric 2014-05-18 06:18:27 PDT
I received a bugreport about the following assertion failure in clang, when building libglapi (a component of Mesa):

https://jenkins.freebsd.org/pci/head-i386/logs/bulk/headi386-default/753/logs/errors/libglapi-9.1.7.log

After some reducing, the testcase turns out to be very small:

__asm__("x86_entry_end:");
static x86_entry_end[];
entry_generate() {
  int *code_templ = x86_entry_end - 1;
  memcpy(0, code_templ, 32);
}

Compiling this with clang trunk r209086 (with assertions enabled), and any optimization level above -O1, gives:

Assertion failed: (Symbol->isUndefined() && "Cannot define a symbol twice!"), function EmitLabel, file /share/dim/src/llvm/trunk/lib/MC/MCAsmStreamer.cpp, line 309.
[... etc ...]

The code in question originates here in the Mesa repository:

http://cgit.freedesktop.org/mesa/mesa/tree/src/mapi/entry_x86_tls.h#n46

Now the question is: is this "legal" in any sense?  E.g. defining your own inline asm label, which has the same name as a static char array?  Mesa seems to have this construction since some time, so gcc apparently has no trouble with it.

In any case, I think clang should not assert on this.  Either reject the construction (but break Mesa), or deal with it. :)
Comment 1 Dimitry Andric 2014-06-08 16:41:49 PDT
Hi Rafael, do you have any opinion about this bug? :)
Comment 2 Rafael Ávila de Espíndola 2014-07-28 15:28:59 PDT
Given just

static int x86_entry_end[];
void f(int*);
void entry_generate() {
  f(x86_entry_end);
}

clang produces

@x86_entry_end = internal global [1 x i32] zeroinitializer, align 4


That is, it thinks x86_entry_end is a definition. It also prints the warning


test2.c:1:12: warning: tentative array definition assumed to have one element


gcc seems to think it is a declaration.

Changing the code to

static int x86_entry_end[1];

makes gcc output a definition too.
Comment 3 Richard Smith 2014-07-28 16:02:12 PDT
C11 6.9.2/2:

"A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0."

6.9.2/3:

"If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type."


So the program has undefined behavior because it violates a 'shall' requirement in a 'semantics' rule. In -pedantic-errors mode at least, we should probably reject this.


As to whether we should support this particular code as a GNU extension... I don't know, it's fairly heinous. We don't in general guarantee the names of static entities, so this code seems broken regardless of whether we choose to support this particular case.

In any case, we shouldn't assert if the backend sees a symbol defined twice through such shenanigans.
Comment 4 Rafael Ávila de Espíndola 2014-07-28 16:29:56 PDT
A testcase for just the "don't crash" side of this bug would be

module asm "x86_entry_end:"
@x86_entry_end = constant i32 42
Comment 5 Dimitry Andric 2014-07-29 04:23:36 PDT
(In reply to comment #3)
...
> So the program has undefined behavior because it violates a 'shall'
> requirement in a 'semantics' rule. In -pedantic-errors mode at least, we
> should probably reject this.

I agree.


> As to whether we should support this particular code as a GNU extension... I
> don't know, it's fairly heinous. We don't in general guarantee the names of
> static entities, so this code seems broken regardless of whether we choose
> to support this particular case.

In this particular case, the desired construction is a generated jump
table of some sorts:

__asm__(
  ".text\n"
  ".balign 32\n"
  "x86_entry_start:");
[... generated stuff: ...]
__asm__(
  ".hidden shared_dispatch_stub_0\n"
  ".globl shared_dispatch_stub_0\n"
  ".type shared_dispatch_stub_0, @function\n"
  ".balign 32\n"
  "shared_dispatch_stub_0:\n"
  "  movl _glapi_Dispatch, %eax\n"
  "  testl %eax, %eax\n"
  "  je 1f\n"
  "  jmp *(4*0)(%eax)\n"
  "1:\n"
  "  call _glapi_get_dispatch\n"
  "  jmp *(4*0)(%eax)\n"
__asm__(
  ".hidden shared_dispatch_stub_1\n"
  ".globl shared_dispatch_stub_1\n"
  ".type shared_dispatch_stub_1, @function\n"
  ".balign 32\n"
  "shared_dispatch_stub_1:\n"
  "  movl _glapi_Dispatch, %eax\n"
  "  testl %eax, %eax\n"
  "  je 1f\n"
  "  jmp *(4*1)(%eax)\n"
  "1:\n"
  "  call _glapi_get_dispatch\n"
  "  jmp *(4*1)(%eax)\n"
[... much more of that ...]
__asm__(
  ".hidden shared_dispatch_stub_1025\n"
  ".globl shared_dispatch_stub_1025\n"
  ".type shared_dispatch_stub_1025, @function\n"
  ".balign 32\n"
  "shared_dispatch_stub_1025:\n"
  "  movl _glapi_Dispatch, %eax\n"
  "  testl %eax, %eax\n"
  "  je 1f\n"
  "  jmp *(4*1025)(%eax)\n"
  "1:\n"
  "  call _glapi_get_dispatch\n"
  "  jmp *(4*1025)(%eax)\n"
__asm__(
  ".balign 32\n"
  "x86_entry_end:");
[...]
static const char x86_entry_start[];
static const char x86_entry_end[];
[...]
mapi_func
entry_get_public(int slot)
{
  return (mapi_func) (x86_entry_start + slot * 32);
}
[...]
mapi_func
entry_generate(int slot)
{
  const char *code_templ = x86_entry_end - 32;
  void *code;
  mapi_func entry;

  code = u_execmem_alloc(32);
  if (!code)
    return ((void *)0);

  memcpy(code, code_templ, 32);
  entry = (mapi_func) code;
  entry_patch(entry, slot);

  return entry;
}

E.g. the intent is to get at the (aligned) start and end addresses of
the asm statements.  If there is any easier way to get at those, I can
propose it to the FreeBSD port maintainers and/or Mesa upstream.

For now, I changed the x86_entry_start/end items to extern declarations
instead:

extern const char x86_entry_start[];
extern const char x86_entry_end[];

That seems to have the desired effect, e.g. the functions that use the
symbols can access them, though the resulting assembly is slightly
different (it uses @GOT instead of @GOTOFF):

@@ -12340,7 +12340,7 @@
 	addl	$_GLOBAL_OFFSET_TABLE_, %ecx
 	movl	4(%esp), %eax
 	sall	$5, %eax
-	leal	x86_entry_start@GOTOFF(%ecx,%eax), %eax
+	addl	x86_entry_start@GOT(%ecx), %eax
 	ret
 	.cfi_endproc
 .LFE8:
@@ -12381,21 +12381,22 @@
 	testl	%eax, %eax
 	movl	%eax, %esi
 	je	.L7
-	movl	-32+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, (%esi)
-	movl	-28+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 4(%esi)
-	movl	-24+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 8(%esi)
-	movl	-20+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 12(%esi)
-	movl	-16+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 16(%esi)
-	movl	-12+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 20(%esi)
-	movl	-8+x86_entry_end@GOTOFF(%ebx), %eax
-	movl	%eax, 24(%esi)
-	movl	-4+x86_entry_end@GOTOFF(%ebx), %eax
+	movl	x86_entry_end@GOT(%ebx), %eax
+	movl	-32(%eax), %edx
+	movl	%edx, (%esi)
+	movl	-28(%eax), %edx
+	movl	%edx, 4(%esi)
+	movl	-24(%eax), %edx
+	movl	%edx, 8(%esi)
+	movl	-20(%eax), %edx
+	movl	%edx, 12(%esi)
+	movl	-16(%eax), %edx
+	movl	%edx, 16(%esi)
+	movl	-12(%eax), %edx
+	movl	%edx, 20(%esi)
+	movl	-8(%eax), %edx
+	movl	-4(%eax), %eax
+	movl	%edx, 24(%esi)
 	movl	%eax, 28(%esi)
 	movl	32(%esp), %eax
 	movl	%esi, (%esp)

Now I only need some way to "declare" a static symbol, making it usable
from C functions... :)
Comment 6 Rafael Ávila de Espíndola 2014-07-29 10:55:51 PDT
> For now, I changed the x86_entry_start/end items to extern declarations
> instead:
> 
> extern const char x86_entry_start[];
> extern const char x86_entry_end[];
> 
> That seems to have the desired effect, e.g. the functions that use the
> symbols can access them, though the resulting assembly is slightly
> different (it uses @GOT instead of @GOTOFF):

Adding __attribute__((visibility("hidden"))) should give you back the GOTOFF access pattern.
 
> Now I only need some way to "declare" a static symbol, making it usable
> from C functions... :)

No idea if there is a general way to do that.
Comment 7 Dimitry Andric 2014-07-29 14:04:54 PDT
(In reply to comment #6)
> > For now, I changed the x86_entry_start/end items to extern declarations
> > instead:
> > 
> > extern const char x86_entry_start[];
> > extern const char x86_entry_end[];
> > 
> > That seems to have the desired effect, e.g. the functions that use the
> > symbols can access them, though the resulting assembly is slightly
> > different (it uses @GOT instead of @GOTOFF):
> 
> Adding __attribute__((visibility("hidden"))) should give you back the GOTOFF
> access pattern.

Yes, that works fine for both clang and gcc.  Thanks. :)
Comment 8 Dimitry Andric 2015-05-15 18:42:36 PDT
*** Bug 23541 has been marked as a duplicate of this bug. ***
Comment 9 Dimitry Andric 2016-05-30 06:13:40 PDT
*** Bug 27932 has been marked as a duplicate of this bug. ***
Comment 10 Reid Kleckner 2016-05-31 16:51:28 PDT
Is there any action for LLVM to take here? It sounds like mesa is making unreasonable assumptions about the compiler's generated code.

Also, this module level asm doesn't look like it will work with LLVM, given that we emit all module asm together before emitting functions. If that's the core issue, then we should dupe it against http://llvm.org/pr6364.
Comment 11 Richard Smith 2016-05-31 17:57:11 PDT
(In reply to comment #10)
> Is there any action for LLVM to take here?

Rafael's comment#4 example still causes llc to assert.
Comment 12 Dimitry Andric 2016-05-31 23:39:52 PDT
(In reply to comment #10)
> Is there any action for LLVM to take here? It sounds like mesa is making
> unreasonable assumptions about the compiler's generated code.

That's all very well, but LLVM *crashes*.  It should print instead "you cannot do this", point out why, and ideally, suggest a workaround.
Comment 13 Dimitry Andric 2016-08-29 05:57:27 PDT
*** Bug 29169 has been marked as a duplicate of this bug. ***