ld64.lld.darwinnew produces invalid debug info, causing lldb to err with "N_SO in symbol with UID 145239 has invalid sibling in debug map, please file a bug and attach the binary listed in this error" #48058

Closed
nico opened this issue Jan 11, 2021 · 7 comments
Labels
bugzilla Issues migrated from bugzilla lld:MachO

Comments

@nico
Contributor

nico commented Jan 11, 2021

Bugzilla Link 48714
Resolution FIXED
Resolved on Apr 07, 2021 09:19
Version unspecified
OS Linux
Blocks #48803
CC @gkmhub,@int3,@smeenai
Fixed by commit(s) rG982e3c05108b606701d99d43098331357d9dd0ca

Extended Description

Known issue, but I figured I'd file a bug so that I can link to it. Possible repro, extracted from #48001 #c5 (there are likely way smaller repros):

  1. Download https://drive.google.com/file/d/1thKfcfKUMhyJ22HRSjorIKZjH42k3bnZ/view?usp=sharing (warning: large, 1GB compressed, 5.3GB unzipped -- but it links very quickly, less than a second with both linkers).

  2. Link as usual (ld64.lld.darwinnew @response.txt)

  3. Run like e.g. so: lldb -- ./mksnapshot --turbo_instruction_scheduling --target_os=mac --target_arch=x64 --embedded_src embedded.S --embedded_variant Default --random-seed 314159265 --startup_blob snapshot_blob.bin --native-code-counters --verify-heap

When loading the lld-linked binary into lldb, it prints many lines looking like

error: mksnapshot N_SO in symbol with UID 145239 has invalid sibling in debug map, please file a bug and attach the binary listed in this error

This doesn't happen with ld.

(Once this is fixed, debug info isn't terribly useful without the actual source files somewhere. Due to https://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html one has to run settings set target.source-map ../.. actual/local/path/to/src in lldb even if src files are available locally somewhere.)

@nico
Contributor Author

nico commented Jan 11, 2021

assigned to @int3

@int3
Contributor

int3 commented Apr 6, 2021

Seems like fixing the function size calculation didn't address this problem. I will investigate this soonish.

@int3
Contributor

int3 commented Apr 6, 2021

Subscribing Greg Clayton in case he has ideas.

@llvmbot
Collaborator

llvmbot commented Apr 7, 2021

So LLDB, when parsing a symbol table, will look for N_SO symbols and it tries to match up a N_SO symbol with a name (source path) to the N_SO symbol without a name.

So for this binary:

$ dsymutil -s a.out

Symbol table for: 'a.out' (x86_64)

Index n_strx n_type n_sect n_desc n_value
======== -------- ------------------ ------ ------ ----------------
[ 0] 00000035 0e ( SECT ) 08 0000 0000000100008008 '__dyld_private'
[ 1] 00000044 64 (N_SO ) 00 0000 0000000000000000 '/Users/gclayton/Documents/src/args/'
[ 2] 00000068 64 (N_SO ) 00 0000 0000000000000000 'main.cpp'
[ 3] 00000071 66 (N_OSO ) 03 0001 0000000060664660 '/Users/gclayton/Documents/src/args/main.o'
[ 4] 00000001 2e (N_BNSYM ) 01 0000 0000000100003ef0
[ 5] 0000009b 24 (N_FUN ) 01 0000 0000000100003ef0 '_main'
[ 6] 00000001 24 (N_FUN ) 00 0000 0000000000000084
[ 7] 00000001 4e (N_ENSYM ) 01 0000 0000000000000084
[ 8] 00000001 64 (N_SO ) 01 0000 0000000000000000
[ 9] 00000002 0f ( SECT EXT) 01 0010 0000000100000000 '__mh_execute_header'
[ 10] 00000016 0f ( SECT EXT) 01 0000 0000000100003ef0 '_main'
[ 11] 0000001c 01 ( UNDF EXT) 00 0200 0000000000000000 '_printf'
[ 12] 00000024 01 ( UNDF EXT) 00 0200 0000000000000000 'dyld_stub_binder'

LLDB will simplify the symbol table like so:

(lldb) image dump symtab a.out
Symtab, file = /Users/gclayton/Documents/src/args/a.out, num_symbols = 7:
Debug symbol
|Synthetic symbol
||Externally Visible
|||
Index UserID DSX Type File Address/Value Load Address Size Flags Name


[ 0] 1 D SourceFile 0x0000000000000000 Sibling -> [ 3] 0x00640000 /Users/gclayton/Documents/src/args/main.cpp
[ 1] 3 D ObjectFile 0x0000000060664660 0x0000000000000000 0x00660001 /Users/gclayton/Documents/src/args/main.o
[ 2] 5 D X Code 0x0000000100003ef0 0x0000000000000084 0x000f0000 main
[ 3] 0 Data 0x0000000100008008 0x0000000000000008 0x000e0000 _dyld_private
[ 4] 9 X Data 0x0000000100000000 0x0000000000003ef0 0x000f0010 _mh_execute_header
[ 5] 11 Trampoline 0x0000000100003f74 0x0000000000000006 0x00010200 printf
[ 6] 12 X Undefined 0x0000000000000000 0x0000000000000000 0x00010200 dyld_stub_binder

Note that the UserID column refers to the original symbol index. The mach-o symbol table has many duplicate symbol entries for the same thing (for example, '_main' is described by symbols 5, 6 and 10), but LLDB will make only a single symbol for it in the symbol table that LLDB uses.

We see that the first symbol represents the N_SO:
[ 0] 1 D SourceFile 0x0000000000000000 Sibling -> [ 3] 0x00640000 /Users/gclayton/Documents/src/args/main.cpp

It points to a sibling symbol (as we see from "Sibling -> [ 3]"), that is, the first symbol that doesn't belong to the N_SO. This lets us simplify the symbol table that LLDB uses while still maintaining the original scoping: all symbols between the first N_SO with a path and the closing N_SO with no name form a scope in which everything belongs to the source file.
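The scoping rule described above can be sketched in a few lines of Python. This is a simplified model for illustration only, not LLDB's actual implementation: the function name and the (index, n_type, name) tuple layout are assumptions, and the check treats two back-to-back named N_SO entries as the directory/file pair of one scope.

```python
N_SO = 0x64  # stab type of source-file entries, as shown in the dsymutil listing


def unterminated_n_so(symbols):
    """Return indexes of named N_SO entries whose scope is never closed.

    `symbols` is a list of (index, n_type, name) tuples in symbol-table
    order; name is "" when the entry has no name. Hypothetical helper.
    """
    bad = []
    open_so = None         # last named N_SO of the currently open scope
    prev_named_so = False  # was the previous entry a named N_SO?
    for index, n_type, name in symbols:
        is_so = n_type == N_SO
        if is_so and name == "":
            open_so = None  # a nameless N_SO closes the scope
        elif is_so:
            # A named N_SO that does not directly follow another named N_SO
            # starts a new scope; the old one must already be closed.
            if open_so is not None and not prev_named_so:
                bad.append(open_so)
            open_so = index
        prev_named_so = is_so and name != ""
    if open_so is not None:
        bad.append(open_so)  # scope still open at the end of the table
    return bad


# Entries 1-8 of the listing above, with the paths shortened.
GOOD = [
    (1, N_SO, "/src/args/"), (2, N_SO, "main.cpp"), (3, 0x66, "main.o"),
    (4, 0x2E, ""), (5, 0x24, "_main"), (6, 0x24, ""), (7, 0x4E, ""),
    (8, N_SO, ""),
]
# Same table, but the closing N_SO resolves to a single space.
BROKEN = GOOD[:-1] + [(8, N_SO, " ")]
```

With these inputs, the well-formed table yields no complaints, while the table whose closing N_SO carries a one-space name leaves the scope opened at index 2 unterminated.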

So a few things could cause this:

  • you have a N_SO in your executable with a name that isn't followed by a N_SO with no name (LLD bug)
  • you have a valid N_SO in your executable with a name and you have a N_SO with no name, but it is the last symbol in the symbol table and LLDB is complaining (LLDB bug)

The error message points to the N_SO with a UID of 145239. So if you dump the symbol table of the binary that causes this error, using "dsymutil -s /path/to/binary", and then look for the symbol with index 145239, that should point you to the N_SO symbol that doesn't have a N_SO symbol with no name after it. Or it might be that the N_SO symbol is the last symbol in the symbol table (this would be very rare, as all local symbols usually come first in the mach-o symbol table, followed by exported symbols and then by undefined symbols).
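For a table with over a hundred thousand entries, looking up one index by hand is tedious. Below is a rough sketch of a parser for saved "dsymutil -s" output. The regular expression assumes the column layout shown in the listing earlier in this comment, and the names SYMBOL_LINE, parse_symtab and SAMPLE are made up for this example.

```python
import re

# Assumed line shape, modeled on the listing above:
# [index] n_strx n_type (description) n_sect n_desc n_value 'optional name'
SYMBOL_LINE = re.compile(
    r"^\[\s*(?P<index>\d+)\]\s+"
    r"(?P<n_strx>[0-9a-f]+)\s+"
    r"(?P<n_type>[0-9a-f]{2})\s+"
    r"\((?P<desc>[^)]*)\)\s+"
    r"(?P<n_sect>[0-9a-f]{2})\s+"
    r"(?P<n_desc>[0-9a-f]{4})\s+"
    r"(?P<n_value>[0-9a-f]+)"
    r"(?:\s+'(?P<name>.*)')?\s*$"
)


def parse_symtab(text):
    """Map symbol index -> (n_type, name).

    name is None when the line carries no quoted string at all, which is
    different from a quoted string that happens to be a single space.
    """
    symbols = {}
    for line in text.splitlines():
        match = SYMBOL_LINE.match(line.strip())
        if match:  # headers and blank lines simply do not match
            symbols[int(match["index"])] = (
                int(match["n_type"], 16),
                match["name"],
            )
    return symbols


# A few lines copied from the listing above.
SAMPLE = """\
Index n_strx n_type n_sect n_desc n_value
[ 1] 00000044 64 (N_SO ) 00 0000 0000000000000000 '/Users/gclayton/Documents/src/args/'
[ 2] 00000068 64 (N_SO ) 00 0000 0000000000000000 'main.cpp'
[ 5] 0000009b 24 (N_FUN ) 01 0000 0000000100003ef0 '_main'
[ 8] 00000001 64 (N_SO ) 01 0000 0000000000000000
"""

symbols = parse_symtab(SAMPLE)
```

Looking up symbols[145239] in the parsed output of the failing binary, and then the next few entries with n_type 0x64, shows whether a nameless N_SO follows.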

@int3
Contributor

int3 commented Apr 7, 2021

Ohhh. Our intended N_SOs with no name had a string index of zero, but because of D89639, it meant that they were pointing to a string with a single space, rather than an empty string. This was not at all obvious with llvm-nm because it doesn't quote strings, so a single space is indistinguishable from the empty string. But it's a lot more obvious with dsymutil -s since that does quote its strings.
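A small Python illustration of the effect, using made-up string tables rather than bytes from the actual binary. The helper names are hypothetical. It shows only that string index 0 resolves to whatever sits at the start of the table, and that an unquoted listing hides the difference.

```python
def read_cstring(strtab: bytes, offset: int) -> str:
    """Return the NUL-terminated string starting at `offset` in a string table."""
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("utf-8")


def unquoted(name: str) -> str:
    """Format a name without quotes (an llvm-nm-like view, simplified)."""
    return f"N_SO {name}"


def quoted(name: str) -> str:
    """Format a name in single quotes (a dsymutil -s-like view, simplified)."""
    return f"N_SO '{name}'"


# Hypothetical string tables for illustration.
EMPTY_FIRST = b"\0_main\0"   # index 0 resolves to the empty string
SPACE_FIRST = b" \0_main\0"  # index 0 resolves to a single space

empty_name = read_cstring(EMPTY_FIRST, 0)
space_name = read_cstring(SPACE_FIRST, 0)
```

Printed side by side, unquoted(empty_name) and unquoted(space_name) differ only in trailing whitespace, while quoted(...) gives '' versus ' '.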

Thanks Greg!!

@llvmbot
Collaborator

llvmbot commented Apr 7, 2021

The reason I originally wrote "dsymutil -s" in the first dsymutil was so I could see exactly what was in the mach-o symbol table and the quotes helped way back when. Glad you figured it out from my explanation!

@int3
Contributor

int3 commented Nov 27, 2021

mentioned in issue #48803

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.