I received a bugreport about the following assertion failure in clang, when building libglapi (a component of Mesa): https://jenkins.freebsd.org/pci/head-i386/logs/bulk/headi386-default/753/logs/errors/libglapi-9.1.7.log After some reducing, the testcase turns out to be very small: __asm__("x86_entry_end:"); static x86_entry_end[]; entry_generate() { int *code_templ = x86_entry_end - 1; memcpy(0, code_templ, 32); } Compiling this with clang trunk r209086 (with assertions enabled), and any optimization level above -O1, gives: Assertion failed: (Symbol->isUndefined() && "Cannot define a symbol twice!"), function EmitLabel, file /share/dim/src/llvm/trunk/lib/MC/MCAsmStreamer.cpp, line 309. [... etc ...] The code in question originates here in the Mesa repository: http://cgit.freedesktop.org/mesa/mesa/tree/src/mapi/entry_x86_tls.h#n46 Now the question is: is this "legal" in any sense? E.g. defining your own inline asm label, which has the same name as a static char array? Mesa seems to have this construction since some time, so gcc apparently has no trouble with it. In any case, I think clang should not assert on this. Either reject the construction (but break Mesa), or deal with it. :)
Hi Rafael, do you have any opinion about this bug? :)
Given just static int x86_entry_end[]; void f(int*); void entry_generate() { f(x86_entry_end); } clang produces @x86_entry_end = internal global [1 x i32] zeroinitializer, align 4 That is, it thinks x86_entry_end is a definition. It also prints the warning test2.c:1:12: warning: tentative array definition assumed to have one element gcc seems to think it is a declaration. Changing the code to static int x86_entry_end[1]; makes gcc output a definition too.
C11 6.9.2/2: "A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0." 6.9.2/3: "If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type." So the program has undefined behavior because it violates a 'shall' requirement in a 'semantics' rule. In -pedantic-errors mode at least, we should probably reject this. As to whether we should support this particular code as a GNU extension... I don't know, it's fairly heinous. We don't in general guarantee the names of static entities, so this code seems broken regardless of whether we choose to support this particular case. In any case, we shouldn't assert if the backend sees a symbol defined twice through such shenanigans.
A testcase for just the "don't crash" side of this bug would be module asm "x86_entry_end:" @x86_entry_end = constant i32 42
(In reply to comment #3) ... > So the program has undefined behavior because it violates a 'shall' > requirement in a 'semantics' rule. In -pedantic-errors mode at least, we > should probably reject this. I agree. > As to whether we should support this particular code as a GNU extension... I > don't know, it's fairly heinous. We don't in general guarantee the names of > static entities, so this code seems broken regardless of whether we choose > to support this particular case. In this particular case, the desired construction is a generated jump table of some sorts: __asm__( ".text\n" ".balign 32\n" "x86_entry_start:"); [... generated stuff: ...] __asm__( ".hidden shared_dispatch_stub_0\n" ".globl shared_dispatch_stub_0\n" ".type shared_dispatch_stub_0, @function\n" ".balign 32\n" "shared_dispatch_stub_0:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp *(4*0)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp *(4*0)(%eax)\n" __asm__( ".hidden shared_dispatch_stub_1\n" ".globl shared_dispatch_stub_1\n" ".type shared_dispatch_stub_1, @function\n" ".balign 32\n" "shared_dispatch_stub_1:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp *(4*1)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp *(4*1)(%eax)\n" [... much more of that ...] __asm__( ".hidden shared_dispatch_stub_1025\n" ".globl shared_dispatch_stub_1025\n" ".type shared_dispatch_stub_1025, @function\n" ".balign 32\n" "shared_dispatch_stub_1025:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp *(4*1025)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp *(4*1025)(%eax)\n" __asm__( ".balign 32\n" "x86_entry_end:"); [...] static const char x86_entry_start[]; static const char x86_entry_end[]; [...] mapi_func entry_get_public(int slot) { return (mapi_func) (x86_entry_start + slot * 32); } [...] mapi_func entry_generate(int slot) { const char *code_templ = x86_entry_end - 32; void *code; mapi_func entry; code = u_execmem_alloc(32); if (!code) return ((void *)0); memcpy(code, code_templ, 32); entry = (mapi_func) code; entry_patch(entry, slot); return entry; } E.g. the intent is to get at the (aligned) start and end addresses of the asm statements. If there is any easier way to get at those, I can propose it to the FreeBSD port maintainers and/or Mesa upstream. For now, I changed the x86_entry_start/end items to extern declarations instead: extern const char x86_entry_start[]; extern const char x86_entry_end[]; That seems to have the desired effect, e.g. the functions that use the symbols can access them, though the resulting assembly is slightly different (it uses @GOT instead of @GOTOFF): @@ -12340,7 +12340,7 @@ addl $_GLOBAL_OFFSET_TABLE_, %ecx movl 4(%esp), %eax sall $5, %eax - leal x86_entry_start@GOTOFF(%ecx,%eax), %eax + addl x86_entry_start@GOT(%ecx), %eax ret .cfi_endproc .LFE8: @@ -12381,21 +12381,22 @@ testl %eax, %eax movl %eax, %esi je .L7 - movl -32+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, (%esi) - movl -28+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 4(%esi) - movl -24+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 8(%esi) - movl -20+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 12(%esi) - movl -16+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 16(%esi) - movl -12+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 20(%esi) - movl -8+x86_entry_end@GOTOFF(%ebx), %eax - movl %eax, 24(%esi) - movl -4+x86_entry_end@GOTOFF(%ebx), %eax + movl x86_entry_end@GOT(%ebx), %eax + movl -32(%eax), %edx + movl %edx, (%esi) + movl -28(%eax), %edx + movl %edx, 4(%esi) + movl -24(%eax), %edx + movl %edx, 8(%esi) + movl -20(%eax), %edx + movl %edx, 12(%esi) + movl -16(%eax), %edx + movl %edx, 16(%esi) + movl -12(%eax), %edx + movl %edx, 20(%esi) + movl -8(%eax), %edx + movl -4(%eax), %eax + movl %edx, 24(%esi) movl %eax, 28(%esi) movl 32(%esp), %eax movl %esi, (%esp) Now I only need some way to "declare" a static symbol, making it usable from C functions... :)
> For now, I changed the x86_entry_start/end items to extern declarations > instead: > > extern const char x86_entry_start[]; > extern const char x86_entry_end[]; > > That seems to have the desired effect, e.g. the functions that use the > symbols can access them, though the resulting assembly is slightly > different (it uses @GOT instead of @GOTOFF): Adding __attribute__((visibility("hidden"))) should give you back the GOTOFF access pattern. > Now I only need some way to "declare" a static symbol, making it usable > from C functions... :) No idea if there is a general way to do that.
(In reply to comment #6) > > For now, I changed the x86_entry_start/end items to extern declarations > > instead: > > > > extern const char x86_entry_start[]; > > extern const char x86_entry_end[]; > > > > That seems to have the desired effect, e.g. the functions that use the > > symbols can access them, though the resulting assembly is slightly > > different (it uses @GOT instead of @GOTOFF): > > Adding __attribute__((visibility("hidden"))) should give you back the GOTOFF > access pattern. Yes, that works fine for both clang and gcc. Thanks. :)
*** Bug 23541 has been marked as a duplicate of this bug. ***
*** Bug 27932 has been marked as a duplicate of this bug. ***
Is there any action for LLVM to take here? It sounds like mesa is making unreasonable assumptions about the compiler's generated code. Also, this module level asm doesn't look like it will work with LLVM, given that we emit all module asm together before emitting functions. If that's the core issue, then we should dupe it against http://llvm.org/pr6364.
(In reply to comment #10) > Is there any action for LLVM to take here? Rafael's comment#4 example still causes llc to assert.
(In reply to comment #10) > Is there any action for LLVM to take here? It sounds like mesa is making > unreasonable assumptions about the compiler's generated code. That's all very well, but LLVM *crashes*. It should print instead "you cannot do this", point out why, and ideally, suggest a workaround.
*** Bug 29169 has been marked as a duplicate of this bug. ***