263 – Target Triples and Shared Libraries

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Core LLVM classes
Version:	1.0
Hardware:	All All

Importance:	P normal
Assignee:	Reid Spencer

URL:
Keywords:	new-feature

Depends on:
Blocks:	402

Reported:	2004-02-28 14:16 PST by Chris Lattner
Modified:	2010-02-22 12:46 PST
CC List:	2 users

See Also:
Fixed By Commit(s):

Description Chris Lattner 2004-02-28 14:16:02 PST

The LLVM Bytecode format, AsmParser, and Module class need to be extended to
support robust target identification and a list of needed shared libraries.

Currently we use endianness and pointer-size to auto-select a code generator to
use with a bytecode file.  This is obviously really limited (i.e., we can't
distinguish between i386 and i686, or PPC and SparcV8, for example), and not
complete enough.  Instead, we should allow the front-end to encode a standard
GNU style "target-triple" in the .s file, and propagate it through to the
compilation and optimization steps.  From this target triple, we can robustly
identify a code generator or TargetMachine to use.
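
A rough sketch of the limitation described above. The lookup table and the
`parse_triple` helper are hypothetical and are not LLVM API; the triple is
assumed to have the simple three-part arch-vendor-os form. Endianness plus
pointer size maps several architectures to the same key, whereas the triple
keeps them apart.

```python
from typing import List, NamedTuple

# Hypothetical table for illustration only: architectures keyed by
# (endianness, pointer size in bits). Not taken from LLVM.
BY_LAYOUT = {
    ("little", 32): ["i386", "i686"],
    ("big", 32): ["powerpc", "sparc"],
}


class Triple(NamedTuple):
    arch: str
    vendor: str
    os: str


def parse_triple(text: str) -> Triple:
    """Split a GNU-style 'arch-vendor-os' string into its parts.

    The OS part may itself contain dashes (e.g. 'linux-gnu'), so only the
    first two dashes are treated as separators.
    """
    parts = text.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not an arch-vendor-os triple: {text!r}")
    return Triple(*parts)


def candidates_by_layout(endianness: str, pointer_bits: int) -> List[str]:
    """Return every architecture consistent with the layout bits alone."""
    return BY_LAYOUT.get((endianness, pointer_bits), [])
```

With only the layout bits, `candidates_by_layout("little", 32)` cannot choose
between i386 and i686, while `parse_triple("i686-pc-linux-gnu").arch` names the
architecture directly.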

Even when target-triple support is added though, we should still keep the
endianness/pointer size bits around, as they are useful for extracting
information about unknown targets.  Also, if a front-end generates portable
code, it should obviously leave the target triple blank, indicating it works
with any target.

While this is being added, it would also make sense to add support for
remembering the shared libraries that a module depends on.  Currently when
'gccld' links a program, it statically links in any libraries in LLVM form, then
forgets the rest.  This requires the "user" to remember which libraries must be
used when compiling to an exe file or running with the JIT.

To fix this, a module should be able to depend on external "libraries" of code,
either in LLVM form or in native form.  This would allow us to "dynamically
link" libstdc++, for example, to C++ programs.  When the JIT starts doing
off-line caching and neat stuff like that, it could just load the native code
for a library that is already compiled, instead of JIT compiling the whole
library every time an app uses it.

Though it would be nice to have this before 1.2, it looks unlikely that this
will happen.  I'm just adding this bug so it doesn't get forgotten.

-Chris

Comment 1 Chris Lattner 2004-06-02 17:52:17 PDT

Interestingly enough, the GCC people are starting to realize that they need
something very similar to the library support described in this PR:

http://gcc.gnu.org/ml/gcc/2004-06/msg00116.html

-Chris

Comment 2 Reid Spencer 2004-07-18 08:25:20 PDT

The "depend on library" feature is great and should be added. However, I have 
no idea how a compiler would determine the library dependencies based on its 
input (i.e. the morass of C/C++ header files require what libraries?).  The 
feature can be added to the bytecode/AsmWriter/Module but it's unclear how it 
gets used from there.

As for the target triples, I'm thinking this is a bad idea for byte code. One 
of the design goals for bytecode should be target independence. You should be 
able to move a .bc file to any target LLVM supports and run it and get correct 
results.  This is extremely important to LLVM's design, I believe. If we 
encode the target triple into a bc file, what purpose does it serve? To record 
what platform the .bc file was generated on? Who cares? What is needed, is a 
way to specify a target triple to the code generators to indicate what kind of 
(native) code they should generate. This should even support cross compilation.

Comment 3 Chris Lattner 2004-07-18 13:04:41 PDT

> However, I have no idea how a compiler would determine the library
> dependencies based on its input (i.e. the morass of C/C++ header files
> require what libraries?).

In MSIL, each external function specifies which library it comes from.  In C
land, this information is presented to the linker.  The idea is to remember it
after gccld runs.  Also, consider if you link a library X... we want to remember
all of the Y & Z libraries that X depends on, so when we link X to an
application, we also know about Y and Z.
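
A minimal sketch of the "remember Y and Z when linking X" idea, assuming each
module records only its direct library dependencies. The function name and the
dictionary layout are made up for illustration and do not describe how gccld
or lib/Linker store this information.

```python
from typing import Dict, List


def resolve_libraries(roots: List[str], deps: Dict[str, List[str]]) -> List[str]:
    """Return the roots plus everything they depend on, each listed once.

    Libraries appear in first-seen (depth-first, pre-order) order.
    """
    ordered: List[str] = []
    seen = set()

    def visit(lib: str) -> None:
        if lib in seen:  # already handled; also guards against cycles
            return
        seen.add(lib)
        ordered.append(lib)
        for dep in deps.get(lib, []):
            visit(dep)

    for root in roots:
        visit(root)
    return ordered
```

Given `deps = {"X": ["Y", "Z"]}`, linking an application against X alone would
yield X, Y and Z, so the user no longer has to list Y and Z by hand.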

> The feature can be added to the bytecode/AsmWriter/Module but it's unclear
> how it gets used from there.

It gets used by the JIT (to dlopen native .so's), and by the mythical, magical
compiler driver, to link the output of llc.

> As for the target triples, I'm thinking this is a bad idea for byte code. One 
> of the design goals for bytecode should be target independence.

One of the nice things about llvm bytecode is that if the source language is
target-indep, so will the LLVM bytecode.  However, C and C++ are not, and there
is no way to guarantee target independent bytecode.  e.g.:

int X[sizeof(void*)];

Cannot be compiled to something that is target independent.
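
To make the point concrete, here is a small sketch of what a front end
effectively does with that declaration. It assumes `int` is 32 bits and that
`sizeof` is folded to a constant before any LLVM code is emitted; the helper
name is hypothetical and the type is written in present-day `[N x i32]`
notation.

```python
def lowered_array_type(pointer_bits: int) -> str:
    """Return an LLVM-style array type for `int X[sizeof(void*)]`.

    Assumes `int` is 32 bits wide; only the element count depends on
    the target's pointer size.
    """
    if pointer_bits <= 0 or pointer_bits % 8 != 0:
        raise ValueError("pointer size must be a positive whole number of bytes")
    length = pointer_bits // 8  # sizeof(void*) in bytes
    return f"[{length} x i32]"
```

A 32-bit target gets a four-element array and a 64-bit target an eight-element
one, so the emitted bytecode already differs between the two.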

> If we encode the target triple into a bc file, what purpose does it serve? 

There are a couple of things, but the most important is the ability to pick
target machines, and the ability to support target-specific features like
calling conventions (fastcall, cdecl, thiscall, pascal, -mregparm, etc).  We cannot
replace GCC unless we can operate as a great target-specific compiler as well as
a target independent compiler.

-Chris

Comment 4 Reid Spencer 2004-07-25 12:46:05 PDT

Mine

Comment 5 Reid Spencer 2004-07-25 16:24:49 PDT

Fixed.

Comment 6 Reid Spencer 2004-07-25 17:04:16 PDT

Erm, not quite fixed.

The Bytecode, AsmWriter, and AsmParser parts of this bug are done and tested.

What remains is the portion that actually uses the information in the Linker and
JIT. I will leave this to others more knowledgeable about that code.

Comment 7 Chris Lattner 2004-11-18 14:49:04 PST

The C/C++ front-end is now producing shared library and target triple info.

Comment 8 Reid Spencer 2004-11-25 03:54:12 PST

All the linking code has been consolidated into lib/Linker and all three linkers
now use this library. Furthermore, the dependent libraries feature is now being
used by lib/Linker to automatically resolve dependent libraries. 

This bug is 1/2 complete. The target-triple support still needs to be added.

Comment 9 Reid Spencer 2004-11-25 15:35:12 PST

Scheduled for 1.5

Comment 10 Chris Lattner 2004-12-10 14:27:18 PST

The linker now handles TT support:
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022070.html

What is left to close this bug?

-Chris

Comment 11 Misha Brukman 2004-12-10 14:30:25 PST

Methinks to close this bug, targets need to be selected based on
pattern-matching the Module's target triple.

Comment 12 Reid Spencer 2004-12-10 14:50:50 PST

Not only that, but code generation needs to take into account sub-targets. Re-read
the initial posting on this bug. We should be able to generate code that is
suitable for a 386 on up. Same thing with variants of PowerPC and Sparc. I don't
think we have sub-target support yet. Providing the target-triple is just the
tip of the iceberg in my perspective. Perhaps the sub-target support and
*using* the target-triple is another task.

There's something that still bothers me about all this. While we need to
support front ends that are machine specific (e.g. C) and the target-triple
seems to do that, why should that affect the way code is generated? I.e. don't
we really need two things here? One is the target-triple and the other is the
actual machine for which code should be generated?

Comment 13 Chris Lattner 2004-12-10 14:55:32 PST

> Not only that, but code generation needs to take into account sub-targets.

No, that is a separate issue.  This bug is just about getting the information
into LLVM so we CAN do that.

> There's something that still bothers me about all this. While we need to
> support front ends that are machine specific (e.g. C) and the target-triple
> seems to do that, why should that affect the way code is generated?

This information is really only used for one thing: making cross compilers
transparent.

The heuristic for target selection is:

1. If -march is specified, use it.
2. If t-t is specified, use it.
3. Otherwise, pick the appropriate target based on the host.
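
The three-step heuristic above could be sketched roughly as follows. The
pattern table, target names and function names are assumptions made for
illustration and are not LLVM's actual tables or API. Returning `None` for an
unrecognised module triple, and not falling back to the host, is also a
choice of this sketch.

```python
import fnmatch
from typing import Optional

# Hypothetical triple patterns for illustration; not LLVM's real tables.
TARGET_PATTERNS = [
    ("x86", "i?86-*"),
    ("ppc32", "powerpc-*"),
    ("sparcv8", "sparc-*"),
]


def target_for_triple(triple: str) -> Optional[str]:
    """Return the first target whose pattern matches the triple, if any."""
    for name, pattern in TARGET_PATTERNS:
        if fnmatch.fnmatchcase(triple, pattern):
            return name
    return None


def select_target(march: Optional[str], module_triple: str,
                  host_triple: str) -> Optional[str]:
    """Apply the precedence: -march, then module triple, then host."""
    if march:                      # 1. an explicit -march wins
        return march
    if module_triple:              # 2. otherwise use the module's triple
        return target_for_triple(module_triple)
    return target_for_triple(host_triple)  # 3. blank triple: use the host
```

A module with a blank triple (portable code) falls through to step 3, which
matches the earlier suggestion that portable front ends leave the triple empty.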

-Chris

Comment 14 Chris Lattner 2004-12-12 11:42:17 PST

Fixed.  The last piece of this was making targets autoselect themselves based on
the target triple.  This is implemented here:

http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022136.html
..
http://mail.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20041206/022139.html

-Chris