LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 263 - Target Triples and Shared Libraries
Summary: Target Triples and Shared Libraries
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Core LLVM classes
Version: 1.0
Hardware: All All
Importance: P normal
Assignee: Reid Spencer
URL:
Keywords: new-feature
Depends on:
Blocks: 402
 
Reported: 2004-02-28 14:16 PST by Chris Lattner
Modified: 2010-02-22 12:46 PST

See Also:
Fixed By Commit(s):


Description Chris Lattner 2004-02-28 14:16:02 PST
The LLVM Bytecode format, AsmParser, and Module class need to be extended to
support robust target identification and a list of needed shared libraries.

Currently we use endianness and pointer-size to auto-select a code generator to
use with a bytecode file.  This is really limited (e.g., we can't distinguish
between i386 and i686, or between PPC and SparcV8) and not complete enough.
Instead, we should allow the front-end to encode a standard GNU-style
"target triple" in the .s file, and propagate it through the compilation and
optimization steps.  From this target triple, we can robustly identify a code
generator or TargetMachine to use.

Even when target-triple support is added though, we should still keep the
endianness/pointer size bits around, as they are useful for extracting
information about unknown targets.  Also, if a front-end generates portable
code, it should obviously leave the target triple blank, indicating it works
with any target.
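
A minimal sketch of what reading such a triple could look like. The TargetTriple struct and parseTriple function are hypothetical names used only for illustration, not LLVM classes, and the "arch-vendor-os" split assumes the usual GNU layout:

```cpp
#include <string>

// Hypothetical record for the pieces of a GNU-style triple such as
// "i686-pc-linux-gnu". This is an illustration, not an LLVM class.
struct TargetTriple {
    std::string arch;    // e.g. "i686", "powerpc", "sparc"
    std::string vendor;  // e.g. "pc", "apple", "sun"
    std::string os;      // everything after the second '-', e.g. "linux-gnu"

    // A blank triple means the front-end produced portable code.
    bool empty() const { return arch.empty() && vendor.empty() && os.empty(); }
};

// Split "arch-vendor-os" on the first two dashes. Missing pieces stay empty.
TargetTriple parseTriple(const std::string &text) {
    TargetTriple t;
    std::string::size_type first = text.find('-');
    t.arch = text.substr(0, first);
    if (first == std::string::npos)
        return t;
    std::string::size_type second = text.find('-', first + 1);
    t.vendor = text.substr(first + 1, second == std::string::npos
                                          ? std::string::npos
                                          : second - first - 1);
    if (second == std::string::npos)
        return t;
    t.os = text.substr(second + 1);
    return t;
}
```

With this, parseTriple("i386-pc-linux-gnu") and parseTriple("i686-pc-linux-gnu") yield different arch fields, which endianness and pointer size alone cannot tell apart, while parseTriple("") reports empty().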

While this is being added, it would also make sense to add support for
remembering the shared libraries that a module depends on.  Currently when
'gccld' links a program, it statically links in any libraries in LLVM form, then
forgets the rest.  This requires the "user" to remember which libraries must be
used when compiling to an executable or running with the JIT.

To fix this, a module should be able to depend on external "libraries" of code,
either in LLVM form or in native form.  This would allow us to "dynamically
link" libstdc++, for example, to C++ programs.  When the JIT starts doing
off-line caching and neat stuff like that, it could just load the native code
for a library that is already compiled, instead of JIT compiling the whole
library every time an app uses it.
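
One way to picture the bookkeeping, as a rough sketch only: ModuleSketch and recordLibraries are hypothetical names, and the real Module class and gccld logic are not shown here. Libraries available in LLVM form are treated as statically linked, and everything else is remembered on the module instead of being forgotten:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a module; the real Module class is far richer.
struct ModuleSketch {
    std::string name;
    std::vector<std::string> dependentLibraries;  // recorded, not yet linked
};

// Link-time bookkeeping: a library available in LLVM form is linked in
// statically, so no dependency remains. Any other library is recorded on the
// output module so that later tools (JIT, native linker) can find it.
void recordLibraries(ModuleSketch &out,
                     const std::vector<std::string> &requested,
                     const std::set<std::string> &availableAsLLVM) {
    for (const std::string &lib : requested) {
        if (availableAsLLVM.count(lib))
            continue;  // statically linked; nothing to remember
        out.dependentLibraries.push_back(lib);
    }
}
```

For example, requesting {"m", "stdc++", "c"} when only "m" exists in LLVM form would leave "stdc++" and "c" recorded on the module.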

Though it would be nice to have this before 1.2, it looks unlikely that this
will happen.  I'm just adding this bug so it doesn't get forgotten.

-Chris
Comment 1 Chris Lattner 2004-06-02 17:52:17 PDT
Interestingly enough, the GCC people are starting to realize that they need
something very similar to the library support described in this PR:

http://gcc.gnu.org/ml/gcc/2004-06/msg00116.html

-Chris
Comment 2 Reid Spencer 2004-07-18 08:25:20 PDT
The "depend on library" feature is great and should be added. However, I have 
no idea how a compiler would determine the library dependencies based on its 
input (i.e. the morass of C/C++ header files require what libraries?).  The 
feature can be added to the bytecode/AsmWriter/Module but it's unclear how it
gets used from there.

As for the target triples, I'm thinking this is a bad idea for bytecode. One
of the design goals for bytecode should be target independence. You should be
able to move a .bc file to any target LLVM supports and run it and get correct
results.  This is extremely important to LLVM's design, I believe. If we
encode the target triple into a .bc file, what purpose does it serve? To record
what platform the .bc file was generated on? Who cares? What is needed is a
way to specify a target triple to the code generators to indicate what kind of
(native) code they should generate. This should even support cross compilation.

Comment 3 Chris Lattner 2004-07-18 13:04:41 PDT
> However, I have no idea how a compiler would determine the library
> dependencies based on its input (i.e. the morass of C/C++ header files
> require what libraries?).

In MSIL, each external function specifies which library it comes from.  In C
land, this information is presented to the linker.  The idea is to remember it
after gccld runs.  Also, consider if you link a library X... we want to remember
all of the Y & Z libraries that X depends on, so when we link X to an
application, we also know about Y and Z.
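
The X, Y, Z case amounts to a transitive closure over recorded dependencies. A small sketch, assuming each library's direct dependencies are available in a map (DependencyMap and collectLibraries are hypothetical names for illustration):

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a library name to the libraries it directly depends on.
using DependencyMap = std::map<std::string, std::vector<std::string>>;

// Collect `lib` plus everything it depends on, directly or indirectly.
void collectLibraries(const std::string &lib, const DependencyMap &deps,
                      std::set<std::string> &needed) {
    if (!needed.insert(lib).second)
        return;  // already visited; this also guards against cycles
    DependencyMap::const_iterator it = deps.find(lib);
    if (it == deps.end())
        return;  // no recorded dependencies for this library
    for (const std::string &dep : it->second)
        collectLibraries(dep, deps, needed);
}
```

Linking an application against X would then start from collectLibraries("X", deps, needed) and end up with X, Y, and Z in the needed set.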

> The feature can be added to the bytecode/AsmWriter/Module but its unclear
> how it gets used from there.

It gets used by the JIT (to dlopen native .so's), and by the mythical magical
compiler driver, to link the output of llc.

> As for the target triples, I'm thinking this is a bad idea for byte code. One 
> of the design goals for bytecode should be target independence.

One of the nice things about LLVM bytecode is that if the source language is
target-independent, so is the LLVM bytecode.  However, C and C++ are not, and there
is no way to guarantee target-independent bytecode.  For example:

int X[sizeof(void*)];

This cannot be compiled to something that is target-independent.
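
To make the point concrete, here is the same declaration with its length exposed as a compile-time constant (kElements is a name introduced only for this illustration). The front-end fixes the value before any bytecode exists, typically 4 on a 32-bit target and 8 on a 64-bit one:

```cpp
#include <cstddef>

int X[sizeof(void*)];  // the declaration from the comment above

// The element count is decided by the front-end at compile time, so it ends
// up in the emitted code as a plain constant tied to one pointer size.
constexpr std::size_t kElements = sizeof(X) / sizeof(X[0]);
static_assert(kElements == sizeof(void*), "array length equals pointer size");
```

Once the array type carries that constant, moving the resulting bytecode to a target with a different pointer size would no longer match what the source meant.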

> If we encode the target triple into a bc file, what purpose does it serve? 

There are a couple of things, but the most important are the ability to pick
target machines and the ability to support target-specific features like
calling conventions (fastcall, cdecl, thiscall, pascal, -mregparm, etc.).  We cannot
replace GCC unless we can operate as a great target-specific compiler as well as
a target-independent compiler.

-Chris
Comment 4 Reid Spencer 2004-07-25 12:46:05 PDT
Mine
Comment 5 Reid Spencer 2004-07-25 16:24:49 PDT
Fixed.
Comment 6 Reid Spencer 2004-07-25 17:04:16 PDT
Erm, not quite fixed.

The Bytecode, AsmWriter, and AsmParser parts of this bug are done and tested.

What remains is the portion that actually uses the information in the Linker and
JIT. I will leave this to others more knowledgeable about that code.
Comment 7 Chris Lattner 2004-11-18 14:49:04 PST
The C/C++ front-end is now producing shared library and target triple info.
Comment 8 Reid Spencer 2004-11-25 03:54:12 PST
All the linking code has been consolidated into lib/Linker and all three linkers
now use this library. Furthermore, the dependent libraries feature is now being
used by lib/Linker to automatically resolve dependent libraries. 

This bug is half complete. The target-triple support still needs to be added.
Comment 9 Reid Spencer 2004-11-25 15:35:12 PST
Scheduled for 1.5
Comment 10 Chris Lattner 2004-12-10 14:27:18 PST
The linker now handles target-triple (TT) support:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022070.html

What is left to close this bug?

-Chris
Comment 11 Misha Brukman 2004-12-10 14:30:25 PST
Methinks to close this bug, targets need to be selected based on
pattern-matching the Module's target triple.
Comment 12 Reid Spencer 2004-12-10 14:50:50 PST
Not only that, but code generation needs to take sub-targets into account. Re-read
the initial posting on this bug. We should be able to generate code that is
suitable for a 386 on up. Same thing with variants of PowerPC and Sparc. I don't
think we have sub-target support yet. Providing the target-triple is just the
tip of the iceberg from my perspective. Perhaps the sub-target support and
*using* the target-triple is another task.

There's something that still bothers me about all this. While we need to
support front ends that are machine specific (e.g. C) and the target-triple
seems to do that, why should that affect the way code is generated? I.e. don't
we really need two things here? One is the target-triple and the other is the
actual machine for which code should be generated?
Comment 13 Chris Lattner 2004-12-10 14:55:32 PST
> Not only that, but code generation needs to into account sub-targets.

No, that is a separate issue.  This bug is just about getting the information
into LLVM so we CAN do that.

> There's something that still bother's me about all this. While we need to
> support front ends that are machine specific (e.g. C) and the target-triple
> seems to do that, why should that affect the way code is generated?

This information is really only used for one thing: making cross compilers
transparent.

The heuristic for target selection is:

1. If -march is specified, use it.
2. If a target triple (t-t) is specified, use it.
3. Otherwise, pick the appropriate target based on the host.
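
The three steps above can be sketched as a simple priority chain. Everything here is illustrative: targetForTriple, selectTarget, and the back-end names "x86", "ppc32", and "sparc" are hypothetical labels, and falling back to the host when a triple is not recognized is an assumption the comment does not state:

```cpp
#include <string>

// Hypothetical mapping from a triple's architecture field to a back-end name.
// Returns an empty string when the architecture is not recognized.
std::string targetForTriple(const std::string &triple) {
    std::string arch = triple.substr(0, triple.find('-'));
    if (arch.size() == 4 && arch[0] == 'i' && arch.compare(2, 2, "86") == 0)
        return "x86";  // matches i386, i486, i586, i686
    if (arch.compare(0, 7, "powerpc") == 0)
        return "ppc32";
    if (arch.compare(0, 5, "sparc") == 0)
        return "sparc";
    return "";
}

// Apply the heuristic in priority order.
std::string selectTarget(const std::string &march,
                         const std::string &moduleTriple,
                         const std::string &hostTarget) {
    if (!march.empty())
        return march;  // 1. an explicit -march always wins
    if (!moduleTriple.empty()) {
        std::string fromTriple = targetForTriple(moduleTriple);
        if (!fromTriple.empty())
            return fromTriple;  // 2. the module's target triple
    }
    return hostTarget;  // 3. otherwise, fall back to the host
}
```

A module carrying "powerpc-apple-darwin7" compiled on an x86 host with no -march would select "ppc32" here, which is the transparent cross-compilation case; a module with a blank triple would get the host's target.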

-Chris
Comment 14 Chris Lattner 2004-12-12 11:42:17 PST
Fixed.  The last piece of this was making targets autoselect themselves based on
the target triple.  This is implemented here:

http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022136.html
..
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022139.html

-Chris