LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 46970 - LLD does not preserve symbol types for defsym
Summary: LLD does not preserve symbol types for defsym
Status: RESOLVED FIXED
Alias: None
Product: lld
Classification: Unclassified
Component: ELF (show other bugs)
Version: unspecified
Hardware: PC All
: P enhancement
Assignee: Fangrui Song
URL:
Keywords:
Depends on:
Blocks: 4068
  Show dependency tree
 
Reported: 2020-08-03 12:10 PDT by Shoaib Meenai
Modified: 2020-11-16 13:01 PST (History)
7 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Shoaib Meenai 2020-08-03 12:10:18 PDT
$ cat f.c
void f() {}

$ clang -c f.c
$ ld.lld -shared --defsym=g=f f.o
$ objdump -T a.out
DYNAMIC SYMBOL TABLE:
00000000000012a0 g    DF .text  0000000000000006 f
00000000000012a0 g    D  .text  0000000000000000 g

f is marked STT_FUNC, but g is not. It might be hard to do in the general case (where e.g. you can have arithmetic in the defsym expression), but in this case, it seems desirable for the alias to get the same symbol type as the original.

From Peter Smith:

There can be a problem on Arm as no interworking will be performed for symbols that are not STT_FUNC.

$ cat h.c
extern void f();
extern void g();
void h() { f(); g(); }

$ clang --target=armv7a-none-eabi -c f.c
$ clang --target=armv7a-none-eabi -c h.c -mthumb
$ ld.lld f.o h.o --defsym g=f  # No --shared to prevent a PLT entry.
$ objdump -d a.out
000200e8 <h>:
   200e8:       b580            push    {r7, lr}
   200ea:       466f            mov     r7, sp
   200ec:       f7ff effa       blx     200e4 <f>
   200f0:       f7ff fff8       bl      200e4 <f>
   200f4:       bd80            pop     {r7, pc}

The blx to f() is correct as a state change is required. The bl to f() will likely crash the program.

ld.bfd correctly marks g as STT_FUNC so it gets the state change correct for both calls.
00008004 <h>:
    8004:       b580            push    {r7, lr}
    8006:       466f            mov     r7, sp
    8008:       f7ff effa       blx     8000 <f>
    800c:       f7ff eff8       blx     8000 <f>
    8010:       bd80            pop     {r7, pc}
Comment 1 Fangrui Song 2020-08-08 18:30:41 PDT
This exposes some challenge because our abstraction for symbol assignment (ExprValue and std::function<ExprValue()>) do not track symbol types. We can add an auxiliary data structure or extend ExprValue with an additional symbol type field. We need to make this elegant.

There are also questions about when the symbol type should be retained. I asked on binutils: https://sourceware.org/pipermail/binutils/2020-August/112706.html

For our purposes, supporting just `alias = aliasee` and not other expressions should probably suffice.
Comment 2 Shoaib Meenai 2020-08-12 13:00:09 PDT
Thanks for starting the conversation with binutils! I agree that just supporting `alias = aliasee` is enough.

Extending ExprValue with either the symbol type or just a pointer to the symbol seem reasonable to me, though I haven't thought about it too deeply. We would also need some way to distinguish between `alias = aliasee` and more complex expressions, correct? I don't think ExprValue tells us that right now (it just tracks the end value of the expression).
Comment 3 Shoaib Meenai 2020-08-19 14:42:21 PDT
This will start blocking an internal codebase soon, so I'm looking into it.
Comment 4 Shoaib Meenai 2020-08-20 09:27:18 PDT
Fangrui is fixing this in https://reviews.llvm.org/D86263. Thank you!
Comment 5 Fangrui Song 2020-08-26 21:41:06 PDT
Fixed by 9670029b6b302c75bb373fb1814f4e02790c4da8 (cherry picked to release/11.x)
Comment 6 Nick Desaulniers 2020-11-16 13:01:57 PST
Just noting that this was a bug affecting the aarch64 Linux kernel's 32b compat VDSO for 4.19.y.  (Everything is fixed, just leaving a note).