Design an ORC C-API / ExecutionEngine replacement #30451

lhames · 2016-11-21T23:36:32Z


Bugzilla Link	31103
Version	trunk
OS	All
CC	@AlexDenisov,@anarazel,@joker-eph,@lhames,@programmerjake,@weliveindetail

Extended Description

Along with llvm/llvm-bugzilla-archive#31101 (unifying the existing in-tree ORC stacks) we should design a new C API for ORC. It may make sense to design a replacement for the C++ ExecutionEngine API alongside this.

I expect the default JIT implementation for any new interface would be provided by the unified stack developed for llvm/llvm-bugzilla-archive#31101 , and it may be worth designing the new stack with the interface in mind.

There is an existing ORC C-API (see http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/OrcBindings.h) that may serve as a starting point, but it's missing some key features (especially support for remote JITing).

A few important questions off the top of my head:

(1) How should resource ownership (Modules, object files, etc) be modeled in the C-API?
#30244 will significantly influence this by locking ORC to a shared ownership model.

(2) How should remote JITing be handled?

(3) How should this interface interact with the LLVM IR interpreter (if at all)?
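
On question (1), one common way for a C API to expose a shared ownership model is a reference-counted opaque handle with explicit retain/release calls. The following is only a minimal sketch of that pattern, not a proposal for the actual bindings: every name in it (SketchModuleRef, sketchModuleRetain, and so on) is hypothetical, the payload is a stand-in string rather than a real Module, and the counter is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque handle: clients only ever see the pointer type. */
typedef struct SketchModule *SketchModuleRef;

struct SketchModule {
  unsigned RefCount; /* number of live owners (client, JIT layers, ...) */
  char *Name;        /* stand-in for the real module payload */
};

/* Creates a module with a single owner (the caller). Returns NULL on
   allocation failure. */
SketchModuleRef sketchModuleCreate(const char *Name) {
  SketchModuleRef M = malloc(sizeof(*M));
  if (!M)
    return NULL;
  size_t Len = strlen(Name) + 1;
  M->Name = malloc(Len);
  if (!M->Name) {
    free(M);
    return NULL;
  }
  memcpy(M->Name, Name, Len);
  M->RefCount = 1;
  return M;
}

/* Adds an owner, e.g. when a JIT takes a share of the module. */
void sketchModuleRetain(SketchModuleRef M) {
  assert(M && M->RefCount > 0);
  ++M->RefCount;
}

/* Drops one owner. Returns 1 if this call destroyed the module, else 0. */
int sketchModuleRelease(SketchModuleRef M) {
  assert(M && M->RefCount > 0);
  if (--M->RefCount != 0)
    return 0;
  free(M->Name);
  free(M);
  return 1;
}

/* Reports the current number of owners (for illustration and testing). */
unsigned sketchModuleUseCount(SketchModuleRef M) {
  assert(M);
  return M->RefCount;
}
```

With this shape, a client that hands a module to the JIT keeps its own reference and releases it whenever it is done; the module is destroyed only when the last owner lets go.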

Thoughts from existing clients would be most welcome.

lhames · 2020-01-21T21:23:39Z

This bug, particularly the C-API part, needs to be tackled before we can kill off ORCv1.

There are two approaches we can take to C bindings for ORCv2:

(1) We can just write a wrapper for LLJIT. This API would provide functionality similar to the ExecutionEngine bindings.

(2) We write wrappers for all of the low level ORCv2 APIs: ExecutionSession, MaterializationUnits, etc.

These approaches are not mutually exclusive. It would probably be best to start with a wrapper for LLJIT, since existing MCJIT/ORCv1 clients will have a relatively easy time migrating to that. Over time, we can add more direct bindings for ORC components as we discover use cases for them.

lhames · 2020-01-28T19:16:25Z

A quick sketch of what an LLJIT C API might look like (prompted by Andres Freund's question on the JIT Weekly #1 thread on llvm-dev):

/* Utilities for creating and modifying JITTargetMachineBuilders */
enum LLVMJTMBRelocModel { JITDefault, Static, PIC_, DynamicNoPIC, ROPI,
RWPI, ROPI_RWPI };
enum LLVMJTMBCodeModel { JITDefault, Tiny, Small, Kernel, Medium, Large };

LLVMErrorRef
createJITTargetMachineBuilder(LLVMJITTargetMachineBuilderRef *Result,
const char *Triple);

LLVMErrorRef
createJITTargetMachineBuilderForHost(LLVMJITTargetMachineBuilderRef *Result);

void setJITTargetMachineBuilderCPU(LLVMJITTargetMachineBuilderRef Builder,
const char *CPUName);

void setJITTargetMachineBuilderRelocModel(LLVMJITTargetMachineBuilderRef Builder,
LLVMJTMBRelocModel RM);

void setJITTargetMachineBuilderCodeModel(LLVMJITTargetMachineBuilderRef Builder,
LLVMJTMBCodeModel CM);

void disposeJITTargetMachineBuilder(LLVMJITTargetMachineBuilderRef Builder);

/* Side note: We'll eventually want a way for clients to configure EmulatedTLS
(currently on by default, but we'll be able to support native soonish).
Unfortunately there doesn't appear to be C-API for setting target options
yet. We might have to develop one. */

/* Utilities for creating an LLJIT instance, adding IR and object modules, and
creating absolute values, reexports, and lazy reexports, etc. */
LLVMErrorRef createDefaultLLJIT(LLVMLLJITRef *Result, JITTargetMachineBuilderRef Builder);
LLVMErrorRef disposeLLJIT(LLVMLLJITRef J);

LLVMJITDylibRef getMainJITDylib(LLVMJITRef J);

LLVMErrorRef createLLJITJITDylib(LLVMJITDylibRef *Result, LLJITRef J, const char *Name, int Bare = 0);
LLVMErrorRef disposeLLJITJITDylib(LLJITRef J, LLVMJITDylibRef JD);

LLVMErrorRef runInitializersForJITDylib(LLJITRef J, LLVMJITDylibRef JD);
LLVMErrorRef runDeInitializersForJITDylib(LLJITRef J, LLVMJITDylibRef JD);

LLVMErrorRef addModuleToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMModuleRef M);

LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);

LLVMErrorRef addAbsoluteSymbol(LLJITRef J, LLVMJITDylibRef JD, const char *Name,
uint64_t Addr, LLVMJITSymbolFlags F);

LLVMErrorRef addReExport(LLJITRef J, LLVMJITDylibRef TargetJD, const char *TargetName,
LLVMJITSymbolFlags TargetFlags, LLVMJITDylibRef SourceJD,
const char *SourceName);

/* Support for lazy-reexports and lazy compilation */
LLVMErrorRef createORCIndirectStubsMgr(LLVMORCIndirectStubsMgrRef *Result, const char *Triple);
void disposeORCIndirectStubsMgr(LLVMORCIndirectStubsMgrRef ISMgr);

LLVMErrorRef createORCLazyCallThroughMgr(LLVMORCLazyCallThroughMgrRef *Result, const char *Triple);
void disposeORCLazyCallThroughMgr(LLVMORCLazyCallThroughMgrRef LCTMgr);

LLVMErrorRef addLazyReExport(LLJITRef J, LLVMORCIndirectStubsMgrRef ISMgr,
LLVMORCLazyCallThroughMgrRef LCTMgr,
LLVMJITDylibRef TargetJD, const char *TargetName,
LLVMJITSymbolFlags TargetFlags,
LLVMJITDylibRef SourceJD, const char *SourceName);

programmerjake · 2020-01-28T19:38:55Z

A quick sketch of what an LLJIT C API might look like (prompted by Andres
Freund's question on the JIT Weekly #1 thread on llvm-dev):

One thing you may want to add is a method to retrieve target-specific information such as supported SIMD widths, size/alignment for different types, the list of target features enabled, etc.

joker-eph · 2020-01-28T21:44:23Z

I see a createJITTargetMachineBuilder function and a LLVMJITTargetMachineBuilderRef type; why would the TMBuilder be specific for the JIT?

lhames · 2020-01-28T23:22:40Z

Hi Mehdi,

why would the TMBuilder be specific for the JIT?

I'm not sure it is. JITTargetMachineBuilder is just a factory for building TargetMachines from a specified config (Triple, CPU name, features, options, etc.). The JIT needs the factory so that it can build TargetMachines as required for concurrent compilation, but the idea of a TM factory is probably equally useful for ThinLTO, and perhaps other uses.

There are two differences that I can think of between ThinLTO's use case and the JIT's:

(1) ThinLTO knows the number of TMs required up front. The JIT doesn't restrict that number ahead of time.

(2) The JIT is free to aggressively detect and use host features by default, on the assumption that JIT'd code is running on the same machine as the JIT compiler. I suspect ThinLTO has to be more conservative in its defaults.

Difference (1) isn't a blocker to sharing code, and difference (2) could be addressed by adding new construction methods while still sharing the rest of the code.
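
The factory idea can be illustrated without any LLVM types: a builder stores the configuration, and each call produces a fresh, independent machine object, so the number of machines does not have to be known up front. This is a hedged sketch only. SketchTargetMachine is a hypothetical stand-in for a real TargetMachine, and none of these names exist in LLVM.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration shared by the builder and what it builds. */
typedef struct {
  char Triple[64];
  char CPU[32];
} SketchTMConfig;

/* Hypothetical stand-in for a TargetMachine: it just carries its config. */
typedef struct {
  SketchTMConfig Config;
} SketchTargetMachine;

/* Hypothetical builder: holds the configuration and owns no machines. */
typedef struct {
  SketchTMConfig Config;
  unsigned NumBuilt; /* how many machines have been requested so far */
} SketchTMBuilder;

/* Stores the configuration; overlong strings are truncated by snprintf. */
void sketchTMBuilderInit(SketchTMBuilder *B, const char *Triple,
                         const char *CPU) {
  assert(B && Triple && CPU);
  snprintf(B->Config.Triple, sizeof(B->Config.Triple), "%s", Triple);
  snprintf(B->Config.CPU, sizeof(B->Config.CPU), "%s", CPU);
  B->NumBuilt = 0;
}

/* Builds a new machine from the stored configuration. The caller owns the
   result and frees it with free(). Returns NULL on allocation failure. */
SketchTargetMachine *sketchTMBuilderCreateTM(SketchTMBuilder *B) {
  assert(B);
  SketchTargetMachine *TM = malloc(sizeof(*TM));
  if (!TM)
    return NULL;
  TM->Config = B->Config; /* each machine gets its own copy */
  ++B->NumBuilt;
  return TM;
}

/* Returns nonzero if both machines were built from the same configuration. */
int sketchTMSameConfig(const SketchTargetMachine *L,
                       const SketchTargetMachine *R) {
  return strcmp(L->Config.Triple, R->Config.Triple) == 0 &&
         strcmp(L->Config.CPU, R->Config.CPU) == 0;
}
```

A JIT-style client would call sketchTMBuilderCreateTM once per compile thread as threads appear, while a ThinLTO-style client could call it a fixed number of times up front; the builder itself is the same in both cases.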

What are your thoughts?

lhames · 2020-01-28T23:24:25Z

One thing you may want to add is a method to retrieve target-specific
information such as supported SIMD widths, size/alignment for different
types, the list of target features enabled, etc.

Sounds good to me. I think this should be easy to do: They would just be extra functions operating on the JITTargetMachineBuilderRef to expose the getOptions / getFeatures methods on JITTargetMachineBuilder.

anarazel · 2020-02-02T08:43:31Z

Hi,

Comments about the API:

/* Utilities for creating and modifying JITTargetMachineBuilders */
enum LLVMJTMBRelocModel { JITDefault, Static, PIC_, DynamicNoPIC, ROPI,
RWPI, ROPI_RWPI };
enum LLVMJTMBCodeModel { JITDefault, Tiny, Small, Kernel, Medium, Large };

There's already llvm-c/TargetMachine.h

typedef enum {
LLVMRelocDefault,
LLVMRelocStatic,
LLVMRelocPIC,
LLVMRelocDynamicNoPic,
LLVMRelocROPI,
LLVMRelocRWPI,
LLVMRelocROPI_RWPI
} LLVMRelocMode;

typedef enum {
LLVMCodeModelDefault,
LLVMCodeModelJITDefault,
LLVMCodeModelTiny,
LLVMCodeModelSmall,
LLVMCodeModelKernel,
LLVMCodeModelMedium,
LLVMCodeModelLarge
} LLVMCodeModel;

probably nice not to duplicate them. But perhaps that's what you were intending anyway?

LLVMErrorRef
createJITTargetMachineBuilderForHost(LLVMJITTargetMachineBuilderRef *Result);

void setJITTargetMachineBuilderCPU(LLVMJITTargetMachineBuilderRef Builder,
const char *CPUName);

Hm. Would it make sense to instead use the same arguments as LLVMCreateTargetMachine, except it'd create a LLVMJITTargetMachineBuilderRef?

There already is LLVMGetHostCPUFeatures(), LLVMGetHostCPUName() etc.

It's a bit sad to have to duplicate most of TargetMachine.h features for builders. Would it make sense to instead use a TargetMachine passed in at creation time for the non-concurrent case, and clone when more are needed? It's fairly likely that a JIT's code generator would need a target machine around anyway, to get a native LLVMTargetDataRef - the various LLVMTargetDataRef taking functions like
/** Computes the ABI alignment of a type in bytes for a target.
See the method llvm::DataLayout::getTypeABISize. */
unsigned LLVMABIAlignmentOfType(LLVMTargetDataRef TD, LLVMTypeRef Ty);
are pretty crucial.

LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);

Probably also need a way to actually get the object file? Or are you envisioning that that'd only be useful in cases where one generates code outside of Orc?

Do you foresee having a replacement for OrcV1's LLVMOrcSymbolResolverFn SymbolResolver? I'm currently using that to resolve function references that don't "properly" exist in the host binary, because they're in shared libraries that aren't globally visible (and which might have duplicate symbols between them, no guaranteed load order, etc). Or would I have to eagerly provide them with addAbsoluteSymbol() or such?

I think LLJIT::lookup() needs to be exported somehow too?

Regards,

Andres

anarazel · 2020-02-02T08:47:02Z

A quick sketch of what an LLJIT C API might look like (prompted by Andres
Freund's question on the JIT Weekly #1 thread on llvm-dev):

One thing you may want to add is a method to retrieve target-specific
information such as supported SIMD widths, size/alignment for different
types, the list of target features enabled, etc.

Most of that functionality already exists in the C API via LLVMGetHostCPUFeatures(), LLVMGetHostCPUName(), LLVMGetTargetMachineCPU(), LLVMGetTargetMachineFeatureString(), LLVMCreateTargetDataLayout(), LLVMABI* - I don't think it'd be good to fully duplicate it for Orcv2.

lhames · 2020-02-17T18:52:36Z

Hi Andres,

Thanks very much for the feedback! (And for your patience with this reply.)

There's already llvm-c/TargetMachine.h

typedef enum {
LLVMRelocDefault,
LLVMRelocStatic,
LLVMRelocPIC,
LLVMRelocDynamicNoPic,
LLVMRelocROPI,
LLVMRelocRWPI,
LLVMRelocROPI_RWPI
} LLVMRelocMode;

typedef enum {
LLVMCodeModelDefault,
LLVMCodeModelJITDefault,
LLVMCodeModelTiny,
LLVMCodeModelSmall,
LLVMCodeModelKernel,
LLVMCodeModelMedium,
LLVMCodeModelLarge
} LLVMCodeModel;

probably nice not to duplicate them. But perhaps that's what you were
intending anyway?

You're right -- the API should just reuse these.

It's a bit sad to have to duplicate most of TargetMachine.h features for
builders. Would it make sense to instead use a TargetMachine passed in at
creation time for the non-concurrent case, and clone when more are needed?

Yep. There's no cloning API for TargetMachines, but we could take a single TargetMachine as a prototype then use the CPU name, features, etc. from that to construct a builder.

LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);

Probably also need a way to actually get the object file? Or are you
envisioning that that'd only be useful in cases where one generates code
outside of Orc?

I'm not sure I follow? This API is just for adding object files directly to the JIT (in which case the user already had the object file being added).

If you mean an API for caching compiled objects: Yes, that would definitely be a worthwhile thing to add.

Do you foresee having a replacement for OrcV1's LLVMOrcSymbolResolverFn
SymbolResolver? I'm currently using that to resolve function references that
don't "properly" exist in the host binary, because they're in shared
libraries that aren't globally visible (and which might have duplicate
symbols between them, no guaranteed load order, etc). Or would I have to
eagerly provide them with addAbsoluteSymbol() or such?

That would require API for adding JITDylib::DefinitionGenerator instances to JITDylibs, and API for constructing a DynamicLibrarySearchGenerator. That sounds great to me. How about:

LLVMDefinitionGeneratorRef createDylibSearchGenerator(void *DylibHandle);
void addDefinitionGenerator(LLVMJITDylibRef JD, LLVMDefinitionGeneratorRef Generator);
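
To make the role of a definition generator concrete, here is a small stand-alone sketch of the idea: lookups first consult the definitions already present, and only on a miss is the generator callback asked to supply one, so nothing has to be provided eagerly. This is not the ORC API. SketchDylib, sketchLookup, and the example generator are all hypothetical, and the address 0x1000 is a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_SYMS 16

/* Hypothetical generator callback: returns nonzero and sets *Addr if it can
   supply a definition for Name, or returns 0 if it cannot. */
typedef int (*SketchGeneratorFn)(void *Ctx, const char *Name, uint64_t *Addr);

/* Hypothetical stand-in for a JITDylib: a tiny fixed-size symbol table.
   Names are stored by pointer, so callers must keep the strings alive. */
typedef struct {
  const char *Names[SKETCH_MAX_SYMS];
  uint64_t Addrs[SKETCH_MAX_SYMS];
  unsigned NumSyms;
  SketchGeneratorFn Generator; /* may be NULL */
  void *GeneratorCtx;
} SketchDylib;

void sketchDylibInit(SketchDylib *JD, SketchGeneratorFn Generator, void *Ctx) {
  assert(JD);
  JD->NumSyms = 0;
  JD->Generator = Generator;
  JD->GeneratorCtx = Ctx;
}

/* Adds an eager definition. Returns 0 on success, -1 if the table is full. */
int sketchDefine(SketchDylib *JD, const char *Name, uint64_t Addr) {
  if (JD->NumSyms == SKETCH_MAX_SYMS)
    return -1;
  JD->Names[JD->NumSyms] = Name;
  JD->Addrs[JD->NumSyms] = Addr;
  ++JD->NumSyms;
  return 0;
}

/* Looks Name up. On a miss, asks the generator and caches its answer.
   Returns 0 on success, -1 if the symbol cannot be resolved. */
int sketchLookup(SketchDylib *JD, const char *Name, uint64_t *Result) {
  for (unsigned I = 0; I != JD->NumSyms; ++I) {
    if (strcmp(JD->Names[I], Name) == 0) {
      *Result = JD->Addrs[I];
      return 0;
    }
  }
  uint64_t Addr = 0;
  if (JD->Generator && JD->Generator(JD->GeneratorCtx, Name, &Addr)) {
    if (sketchDefine(JD, Name, Addr) != 0)
      return -1;
    *Result = Addr;
    return 0;
  }
  return -1;
}

/* Example generator: resolves only "host_answer" and counts how often it is
   consulted. Ctx must point to an unsigned counter. */
int sketchHostGenerator(void *Ctx, const char *Name, uint64_t *Addr) {
  unsigned *Calls = Ctx;
  ++*Calls;
  if (strcmp(Name, "host_answer") != 0)
    return 0;
  *Addr = 0x1000; /* placeholder address, for illustration only */
  return 1;
}
```

In the use case described above, the generator would search one specific, non-global shared library handle instead of a hard-coded name, which is the job the proposed createDylibSearchGenerator is meant to cover.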

I think LLJIT::lookup() needs to be exported somehow too?

Oops. That's a rather serious omission. :)

A basic implementation should probably look like:

LLVMErrorRef lookup(uint64_t *Result, LLJITRef J, LLVMJITDylibRef JD,
const char *SymbolName);
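
A client of a lookup function shaped like this receives the symbol's address as a uint64_t and has to turn it back into a function pointer before calling it. The sketch below shows that step under two assumptions: the JIT'd code lives in the caller's own process, and function pointers round-trip through uintptr_t (true on mainstream platforms, though not guaranteed by ISO C). sketchLookupAdd is a hypothetical stand-in that simply returns the address of an ordinary compiled function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ordinary function standing in for JIT'd code. */
static int sketchAdd(int A, int B) { return A + B; }

/* Hypothetical stand-in for a JIT lookup: writes the address to *Result and
   returns 0 on success, or returns nonzero on failure. */
int sketchLookupAdd(uint64_t *Result) {
  if (!Result)
    return 1;
  *Result = (uint64_t)(uintptr_t)&sketchAdd;
  return 0;
}

typedef int (*SketchBinaryFn)(int, int);

/* Converts a looked-up address back into a callable function pointer. */
SketchBinaryFn sketchToFunction(uint64_t Addr) {
  return (SketchBinaryFn)(uintptr_t)Addr;
}

/* Looks the function up and calls it. Returns 0 on success and stores the
   result in *Out, or returns -1 if the lookup failed. */
int sketchCallAdd(int A, int B, int *Out) {
  assert(Out != NULL);
  uint64_t Addr = 0;
  if (sketchLookupAdd(&Addr) != 0)
    return -1;
  *Out = sketchToFunction(Addr)(A, B);
  return 0;
}
```

For remote JITing the returned address would refer to the target process, so it could not be cast and called locally like this.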

-- Lang.

lhames · 2020-03-14T21:58:03Z

Ok, to get the ball rolling I've committed some bare-bones OrcV2 C bindings and an example use case in 633ea07.

Is anyone interested in picking this up and running with it? I'm happy to review patches and answer questions, but I'd prefer not to drive this: I'm not a C API user/designer, so my design sensibilities and priorities may be wildly off base, and I won't have much time to work on this, so progress would be slow.

lhames · 2020-04-10T00:04:12Z

Basic support for LLJITBuilder and reflecting process symbols has been added in 1cd8493.

Support for adding object file buffers has been added in 0d5f15f.

lhames · 2020-04-10T22:54:20Z

Support for building a JITTargetMachineBuilder from a TargetMachine template (LLVMOrcJITTargetMachineBuilderCreateFromTargetMachine) has been added in 59ed45b.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
