LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 27551 - Bitcode files should have a symbol table similar to object files
Summary: Bitcode files should have a symbol table similar to object files
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Bitcode Writer
Version: trunk
Hardware: PC Linux
Importance: P normal
Assignee: Mehdi Amini
Reported: 2016-04-28 08:08 PDT by Rafael Ávila de Espíndola
Modified: 2017-06-27 16:51 PDT
Description Rafael Ávila de Espíndola 2016-04-28 08:08:03 PDT
Using bitcode files in contexts where one normally uses an object file is a bit odd.

For example, during the symbol resolution phase of a linker, the linker can look at just the symbol table of the object file, and the names it looks up can be StringRefs pointing into the underlying mmap.

The situation is very different with bitcode files:

* The names are stored in a bitcode record and we have to parse quite a bit to make them available.
* The names are compressed (char6), so one cannot just point a StringRef at them.
* Given the mentioned complexity, we would have to duplicate quite a bit of code to read the names, so everyone just builds a Module.
* The names are not final in some cases. For example, in MachO they are missing a leading _ and so have to be passed to the Mangler before the linker can use them in symbol resolution.
* Symbol names from inline assembly are not included anywhere. One has to parse the assembly to get them.
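To illustrate the char6 point above, here is a minimal C++ sketch of 6-bit packing. The alphabet is the one the LLVM bitstream format uses for char6; the helper names `packChar6` and `unpackChar6` are hypothetical, and the sketch ignores the abbreviated-record framing that surrounds names in a real bitcode file. It only shows why the stored bytes are not the name's ASCII bytes, so a StringRef cannot simply point at them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// char6 alphabet: a character is stored as its index (0..63) in this string.
constexpr std::string_view kChar6Alphabet =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._";

// Hypothetical helper: pack a name into consecutive 6-bit fields,
// low bits first. Not LLVM's API, just the idea.
std::vector<std::uint8_t> packChar6(std::string_view name) {
  std::vector<std::uint8_t> out;
  std::uint32_t acc = 0; // pending bits, lowest bit is the oldest
  unsigned bits = 0;     // number of valid bits in acc
  for (char c : name) {
    std::size_t idx = kChar6Alphabet.find(c);
    assert(idx != std::string_view::npos && "not a char6 character");
    acc |= static_cast<std::uint32_t>(idx) << bits;
    bits += 6;
    while (bits >= 8) { // flush every complete byte
      out.push_back(static_cast<std::uint8_t>(acc & 0xFF));
      acc >>= 8;
      bits -= 8;
    }
  }
  if (bits > 0) // flush the zero-padded tail
    out.push_back(static_cast<std::uint8_t>(acc & 0xFF));
  return out;
}

// Hypothetical helper: recover `count` characters from packed bytes.
// The reader has to materialize a new string; no view into `bytes` works.
std::string unpackChar6(const std::vector<std::uint8_t> &bytes,
                        std::size_t count) {
  std::string name;
  std::uint32_t acc = 0;
  unsigned bits = 0;
  std::size_t next = 0;
  while (name.size() < count) {
    while (bits < 6 && next < bytes.size()) { // refill from the byte stream
      acc |= static_cast<std::uint32_t>(bytes[next++]) << bits;
      bits += 8;
    }
    if (bits < 6)
      break; // ran out of input
    name.push_back(kChar6Alphabet[acc & 0x3F]);
    acc >>= 6;
    bits -= 6;
  }
  return name;
}
```

Packing "main" this way yields three bytes, none of which is an ASCII letter of the name, and reading the name back requires allocating a fresh string.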

This creates amusing issues like lib/Object depending on MC and llvm-nm depending on Target.

I think the solution to these problems is to add a blob with a symbol table to the bitcode file. The table would:
* Include the *final* symbol names (_foo, not foo).
* Not be compressed, so that we can drop the IRObjectFile and still keep StringRefs.
* Be easy to parse without a LLVMContext.
* Include names created by inline assembly.
* Include other information a linker or nm would want: linkage, visibility, comdat.
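A rough C++ sketch of what such a blob could look like follows. Everything in it is an assumption made for illustration: the layout (a count, fixed-size entries, then raw name bytes), the field encodings, and the names `SymEntry`, `SymbolIn`, `SymbolRef`, `writeSymtab` and `readSymtab`. It is not the format that eventually landed in LLVM, and it assumes host byte order. The point is only that, with uncompressed final names, a reader can hand out views into the buffer without an LLVMContext or a Module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical fixed-size on-disk entry (host byte order assumed).
struct SymEntry {
  std::uint32_t nameOffset;  // byte offset of the name within the blob
  std::uint32_t nameSize;    // length of the name in bytes
  std::uint8_t linkage;      // hypothetical encoding
  std::uint8_t visibility;   // hypothetical encoding
  std::uint16_t comdatIndex; // hypothetical encoding
};
static_assert(sizeof(SymEntry) == 12);

// Writer-side input: the *final* name, e.g. "_foo" on MachO.
struct SymbolIn {
  std::string name;
  std::uint8_t linkage;
  std::uint8_t visibility;
  std::uint16_t comdatIndex;
};

// Reader-side result: `name` points into the blob, nothing is copied.
struct SymbolRef {
  std::string_view name;
  std::uint8_t linkage;
  std::uint8_t visibility;
  std::uint16_t comdatIndex;
};

// Layout: [count][count * SymEntry][name bytes...]
std::vector<char> writeSymtab(const std::vector<SymbolIn> &syms) {
  std::uint32_t count = static_cast<std::uint32_t>(syms.size());
  std::vector<char> blob(sizeof(count) + count * sizeof(SymEntry));
  std::memcpy(blob.data(), &count, sizeof(count));
  for (std::uint32_t i = 0; i < count; ++i) {
    SymEntry e{static_cast<std::uint32_t>(blob.size()),
               static_cast<std::uint32_t>(syms[i].name.size()),
               syms[i].linkage, syms[i].visibility, syms[i].comdatIndex};
    std::memcpy(blob.data() + sizeof(count) + i * sizeof(SymEntry), &e,
                sizeof(e));
    // Names are appended uncompressed after the entry array.
    blob.insert(blob.end(), syms[i].name.begin(), syms[i].name.end());
  }
  return blob;
}

// Parses with bounds checks only; stops at the first malformed entry.
std::vector<SymbolRef> readSymtab(std::string_view blob) {
  std::vector<SymbolRef> out;
  std::uint32_t count = 0;
  if (blob.size() < sizeof(count))
    return out;
  std::memcpy(&count, blob.data(), sizeof(count));
  for (std::uint32_t i = 0; i < count; ++i) {
    std::size_t pos = sizeof(count) + i * sizeof(SymEntry);
    if (pos + sizeof(SymEntry) > blob.size())
      break;
    SymEntry e;
    std::memcpy(&e, blob.data() + pos, sizeof(e));
    if (e.nameOffset > blob.size() || e.nameSize > blob.size() - e.nameOffset)
      break;
    out.push_back({blob.substr(e.nameOffset, e.nameSize), e.linkage,
                   e.visibility, e.comdatIndex});
  }
  return out;
}
```

A linker or nm-like tool could call `readSymtab` directly on the mapped file contents; the returned names stay valid for as long as the mapping does.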
Comment 1 Mehdi Amini 2016-04-28 10:08:07 PDT
I have exactly the same reasoning and I plan to do that, hopefully this year.
Comment 2 Mehdi Amini 2016-08-03 15:57:12 PDT
As a first step: abstracting away the IR when manipulating a bitcode file as an object file (llvm-nm, ranlib, libLTO symbol resolution…)

Patch out for review: https://reviews.llvm.org/D23132
Comment 3 Peter Collingbourne 2017-06-27 16:51:54 PDT
This was fixed in a series of changes that ended in r306488.