LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 31103 - Design an ORC C-API / ExecutionEngine replacement
Summary: Design an ORC C-API / ExecutionEngine replacement
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: OrcJIT
Version: trunk
Hardware: PC All
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-11-21 15:36 PST by Lang Hames
Modified: 2020-04-10 15:54 PDT
CC List: 7 users

See Also:
Fixed By Commit(s):


Description Lang Hames 2016-11-21 15:36:32 PST
Along with http://llvm.org/PR31101 (unifying the existing in-tree ORC stacks) we should design a new C API for ORC. It may make sense to design a replacement for the C++ ExecutionEngine API alongside this.

I expect the default JIT implementation for any new interface would be provided by the unified stack developed for http://llvm.org/PR31101, and it may be worth designing the new stack with the interface in mind.

There is an existing ORC C-API (see http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm-c/OrcBindings.h) that may serve as a starting point, but it's missing some key features (especially support for remote JITing).

A few important questions off the top of my head:

(1) How should resource ownership (Modules, object files, etc) be modeled in the C-API?
http://llvm.org/PR30896 will significantly influence this by locking ORC to a shared ownership model.

(2) How should remote JITing be handled?

(3) How should this interface interact with the LLVM IR interpreter (if at all)?


Thoughts from existing clients would be most welcome.
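
For question (1), one way a shared ownership model could surface in a C API is through explicit retain/release calls on an opaque handle. The following is only a minimal sketch under that assumption: SharedResource, sharedResourceCreate, sharedResourceRetain, and sharedResourceRelease are hypothetical names invented for illustration and are not part of any LLVM header. The counter is not atomic, so this version is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a JIT-managed resource such as a module or an
   object buffer. This is NOT an LLVM type. */
typedef struct SharedResource {
  unsigned RefCount; /* number of live owners */
  int *DisposedFlag; /* set to 1 on destruction, for demonstration only */
} SharedResource;

typedef SharedResource *SharedResourceRef;

/* Create a resource with exactly one owner: the caller. */
SharedResourceRef sharedResourceCreate(int *DisposedFlag) {
  SharedResourceRef R = malloc(sizeof(*R));
  if (!R)
    return NULL;
  R->RefCount = 1;
  R->DisposedFlag = DisposedFlag;
  return R;
}

/* Add an owner (e.g. the JIT taking a share of a client's module). */
void sharedResourceRetain(SharedResourceRef R) { ++R->RefCount; }

/* Drop an owner; the last owner to release destroys the resource. */
void sharedResourceRelease(SharedResourceRef R) {
  if (--R->RefCount == 0) {
    if (R->DisposedFlag)
      *R->DisposedFlag = 1;
    free(R);
  }
}

/* Returns 0 if the resource outlives the client's release and is destroyed
   only when the second owner releases it. */
int ownershipDemo(void) {
  int Disposed = 0;
  SharedResourceRef M = sharedResourceCreate(&Disposed);
  if (!M)
    return -1;
  sharedResourceRetain(M);  /* the "JIT" takes a share */
  sharedResourceRelease(M); /* the client drops its share */
  if (Disposed)
    return 1;               /* must still be alive here */
  sharedResourceRelease(M); /* the "JIT" drops its share */
  return Disposed ? 0 : 2;
}
```

With this shape the client never needs to know whether the JIT still holds the resource; it only balances its own retain and release calls.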
Comment 1 Lang Hames 2020-01-21 13:23:39 PST
This bug, particularly the C-API part, needs to be tackled before we can kill off ORCv1.

There are two approaches we can take to C bindings for ORCv2:

(1) We can just write a wrapper for LLJIT. This API would provide functionality similar to the ExecutionEngine bindings.

(2) We write wrappers for all of the low level ORCv2 APIs: ExecutionSession, MaterializationUnits, etc.

These approaches are not mutually exclusive. It would probably be best to start with a wrapper for LLJIT, since existing MCJIT/ORCv1 clients will have a relatively easy time migrating to that. Over time, we can add more direct bindings for ORC components as we discover use cases for them.
Comment 2 Lang Hames 2020-01-28 11:16:25 PST
A quick sketch of what an LLJIT C API might look like (prompted by Andres Freund's question on the JIT Weekly #1 thread on llvm-dev):

/* Utilities for creating and modifying JITTargetMachineBuilders */
enum LLVMJTMBRelocModel { JITDefault, Static, PIC_, DynamicNoPIC, ROPI,
                          RWPI, ROPI_RWPI };
enum LLVMJTMBCodeModel { JITDefault, Tiny, Small, Kernel, Medium, Large }; 

LLVMErrorRef
createJITTargetMachineBuilder(LLVMJITTargetMachineBuilderRef *Result,
                              const char *Triple);

LLVMErrorRef
createJITTargetMachineBuilderForHost(LLVMJITTargetMachineBuilderRef *Result);

void setJITTargetMachineBuilderCPU(LLVMJITTargetMachineBuilderRef Builder,
                                   const char *CPUName);

void setJITTargetMachineBuilderRelocModel(LLVMJITTargetMachineBuilderRef Builder,
                                          LLVMJTMBRelocModel RM);

void setJITTargetMachineBuilderCodeModel(LLVMJITTargetMachineBuilderRef Builder,
                                         LLVMJTMBCodeModel CM);

void disposeJITTargetMachineBuilder(LLVMJITTargetMachineBuilderRef Builder);

/* Side note: We'll eventually want a way for clients to configure EmulatedTLS
   (currently on by default, but we'll be able to support native soonish).
   Unfortunately there doesn't appear to be a C API for setting target options
   yet. We might have to develop one. */

/* Utilities for creating an LLJIT instance, adding IR and object modules, and
   creating absolute values, reexports, and lazy reexports, etc. */
LLVMErrorRef createDefaultLLJIT(LLVMLLJITRef *Result, LLVMJITTargetMachineBuilderRef Builder);
LLVMErrorRef disposeLLJIT(LLVMLLJITRef J);

LLVMJITDylibRef getMainJITDylib(LLVMLLJITRef J);

LLVMErrorRef createLLJITJITDylib(LLVMJITDylibRef *Result, LLJITRef J, const char *Name, int Bare = 0);
LLVMErrorRef disposeLLJITJITDylib(LLJITRef J, LLVMJITDylibRef JD);

LLVMErrorRef runInitializersForJITDylib(LLJITRef J, LLVMJITDylibRef JD);
LLVMErrorRef runDeInitializersForJITDylib(LLJITRef J, LLVMJITDylibRef JD);

LLVMErrorRef addModuleToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMModuleRef M);

LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);

LLVMErrorRef addAbsoluteSymbol(LLJITRef J, LLVMJITDylibRef JD, const char *Name,
                               uint64_t Addr, LLVMJITSymbolFlags F);

LLVMErrorRef addReExport(LLJITRef J, LLVMJITDylibRef TargetJD, const char *TargetName,
                         LLVMJITSymbolFlags TargetFlags, LLVMJITDylibRef SourceJD,
                         const char *SourceName);

/* Support for lazy-reexports and lazy compilation */
LLVMErrorRef createORCIndirectStubsMgr(LLVMORCIndirectStubsMgrRef *Result, const char *Triple);
void disposeORCIndirectStubsMgr(LLVMORCIndirectStubsMgrRef ISMgr);

LLVMErrorRef createORCLazyCallThroughMgr(LLVMORCLazyCallThroughMgrRef *Result, const char *Triple);
void disposeORCLazyCallThroughMgr(LLVMORCLazyCallThroughMgrRef LCTMgr);

LLVMErrorRef addLazyReExport(LLJITRef J, LLVMORCIndirectStubsMgrRef ISMgr,
                             LLVMORCLazyCallThroughMgrRef LCTMgr,
                             LLVMJITDylibRef TargetJD, const char *TargetName,
                             LLVMJITSymbolFlags TargetFlags,
                             LLVMJITDylibRef SourceJD, const char *SourceName);
Comment 3 Jacob Lifshay 2020-01-28 11:38:55 PST
(In reply to Lang Hames from comment #2)
> A quick sketch of what an LLJIT C API might look like (prompted by Andres
> Freund's question on the JIT Weekly #1 thread on llvm-dev):

One thing you may want to add is a method to retrieve target-specific information such as supported SIMD widths, size/alignment for different types, the list of target features enabled, etc.
Comment 4 Mehdi Amini 2020-01-28 13:44:23 PST
I see a `createJITTargetMachineBuilder` function and a `LLVMJITTargetMachineBuilderRef` type; why would the TMBuilder be specific to the JIT?
Comment 5 Lang Hames 2020-01-28 15:22:40 PST
Hi Mehdi,

> why would the TMBuilder be specific for the JIT?

I'm not sure it is. JITTargetMachineBuilder is just a factory for building TargetMachines from a specified config (Triple, CPU name, features, options, etc.). The JIT needs the factory so that it can build TargetMachines as required for concurrent compilation, but the idea of a TM factory is probably equally useful for ThinLTO, and perhaps other uses.

There are two differences that I can think of between ThinLTO's use case and the JIT's:

(1) ThinLTO knows the number of TMs required up front. The JIT doesn't restrict that number ahead of time.

(2) The JIT is free to aggressively detect and use host features by default, on the assumption that JIT'd code is running on the same machine as the JIT compiler. I suspect ThinLTO has to be more conservative in its defaults.

Difference (1) isn't a blocker to sharing code, and difference (2) could be addressed by adding new construction methods while still sharing the rest of the code.

What are your thoughts?
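
To make the factory idea concrete, here is a minimal, self-contained sketch of a builder that stores one configuration and stamps out independent machine objects on demand, with no up-front limit on how many are built. All names here (MachineConfig, Machine, MachineBuilder, machineBuilderInit, machineBuilderCreateMachine) are hypothetical and are not the JITTargetMachineBuilder API; the structs only model the Triple / CPU / features configuration described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration: the data a machine is built from. */
typedef struct MachineConfig {
  char Triple[64];
  char CPU[32];
  char Features[128];
} MachineConfig;

/* Hypothetical stand-in for a TargetMachine: owns a copy of the config. */
typedef struct Machine {
  MachineConfig Config;
  unsigned Id; /* build order, for demonstration only */
} Machine;

/* The factory: holds the config and counts how many machines it has built. */
typedef struct MachineBuilder {
  MachineConfig Config;
  unsigned NumBuilt;
} MachineBuilder;

/* Copy a string into a fixed-size field, always NUL-terminating it. */
static void copyField(char *Dst, size_t Size, const char *Src) {
  strncpy(Dst, Src, Size - 1);
  Dst[Size - 1] = '\0';
}

void machineBuilderInit(MachineBuilder *B, const char *Triple,
                        const char *CPU, const char *Features) {
  memset(B, 0, sizeof(*B));
  copyField(B->Config.Triple, sizeof(B->Config.Triple), Triple);
  copyField(B->Config.CPU, sizeof(B->Config.CPU), CPU);
  copyField(B->Config.Features, sizeof(B->Config.Features), Features);
}

/* Build a fresh machine from the stored config. The caller owns the result
   and releases it with free(). */
Machine *machineBuilderCreateMachine(MachineBuilder *B) {
  Machine *M = malloc(sizeof(*M));
  if (!M)
    return NULL;
  M->Config = B->Config;
  M->Id = B->NumBuilt++;
  return M;
}

/* Returns 0 if two machines built from one builder are distinct objects
   that share the same configuration. */
int factoryDemo(void) {
  MachineBuilder B;
  machineBuilderInit(&B, "x86_64-unknown-linux-gnu", "generic", "+sse2");
  Machine *M1 = machineBuilderCreateMachine(&B);
  Machine *M2 = machineBuilderCreateMachine(&B);
  int OK = M1 && M2 && M1 != M2 &&
           strcmp(M1->Config.Triple, M2->Config.Triple) == 0 &&
           strcmp(M1->Config.CPU, "generic") == 0 &&
           M1->Id == 0 && M2->Id == 1;
  free(M1);
  free(M2);
  return OK ? 0 : 1;
}
```

Because each machine carries its own copy of the configuration, concurrent compile threads could each request one without sharing mutable state; the sketch itself does no locking.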
Comment 6 Lang Hames 2020-01-28 15:24:25 PST
(In reply to Jacob Lifshay from comment #3)
> 
> One thing you may want to add is a method to retrieve target-specific
> information such as supported SIMD widths, size/alignment for different
> types, the list of target features enabled, etc.

Sounds good to me. I think this should be easy to do: They would just be extra functions operating on the JITTargetMachineBuilderRef to expose the getOptions / getFeatures methods on JITTargetMachineBuilder.
Comment 7 Andres Freund 2020-02-02 00:43:31 PST
Hi,

(In reply to Lang Hames from comment #2)
Comments about the API:
> /* Utilities for creating and modifying JITTargetMachineBuilders */
> enum LLVMJTMBRelocModel { JITDefault, Static, PIC_, DynamicNoPIC, ROPI,
>                           RWPI, ROPI_RWPI };
> enum LLVMJTMBCodeModel { JITDefault, Tiny, Small, Kernel, Medium, Large }; 

There's already llvm-c/TargetMachine.h

typedef enum {
    LLVMRelocDefault,
    LLVMRelocStatic,
    LLVMRelocPIC,
    LLVMRelocDynamicNoPic,
    LLVMRelocROPI,
    LLVMRelocRWPI,
    LLVMRelocROPI_RWPI
} LLVMRelocMode;

typedef enum {
    LLVMCodeModelDefault,
    LLVMCodeModelJITDefault,
    LLVMCodeModelTiny,
    LLVMCodeModelSmall,
    LLVMCodeModelKernel,
    LLVMCodeModelMedium,
    LLVMCodeModelLarge
} LLVMCodeModel;

probably nice not to duplicate them. But perhaps that's what you were intending anyway?


> LLVMErrorRef
> createJITTargetMachineBuilderForHost(LLVMJITTargetMachineBuilderRef *Result);

> void setJITTargetMachineBuilderCPU(LLVMJITTargetMachineBuilderRef Builder,
>                                    const char *CPUName);

Hm. Would it make sense to instead use the same arguments as LLVMCreateTargetMachine, except it'd create a LLVMJITTargetMachineBuilderRef?

There already is LLVMGetHostCPUFeatures(), LLVMGetHostCPUName() etc.

It's a bit sad to have to duplicate most of the TargetMachine.h features for builders. Would it make sense to instead take a TargetMachine passed in at creation time for the non-concurrent case, and clone it when more are needed? It's fairly likely that a JIT's code generator would need a target machine around anyway, to get a native LLVMTargetDataRef. The various functions that take an LLVMTargetDataRef, such as
/** Computes the ABI alignment of a type in bytes for a target.
    See the method llvm::DataLayout::getTypeABISize. */
unsigned LLVMABIAlignmentOfType(LLVMTargetDataRef TD, LLVMTypeRef Ty);
are pretty crucial.


>  LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);

Probably also need a way to actually get the object file? Or are you envisioning that that'd only be useful in cases where one generates code outside of Orc?

Do you foresee having a replacement for OrcV1's LLVMOrcSymbolResolverFn SymbolResolver? I'm currently using that to resolve function references that don't "properly" exist in the host binary, because they're in shared libraries that aren't globally visible (and which might have duplicate symbols between them, no guaranteed load order, etc). Or would I have to eagerly provide them with addAbsoluteSymbol() or such?


I think LLJIT::lookup() needs to be exported somehow too?

Regards,

Andres
Comment 8 Andres Freund 2020-02-02 00:47:02 PST
(In reply to Jacob Lifshay from comment #3)
> (In reply to Lang Hames from comment #2)
> > A quick sketch of what an LLJIT C API might look like (prompted by Andres
> > Freund's question on the JIT Weekly #1 thread on llvm-dev):
> 
> One thing you may want to add is a method to retrieve target-specific
> information such as supported SIMD widths, size/alignment for different
> types, the list of target features enabled, etc.

Most of that functionality already exists in the C API via LLVMGetHostCPUFeatures(), LLVMGetHostCPUName(), LLVMGetTargetMachineCPU(), LLVMGetTargetMachineFeatureString(), LLVMCreateTargetDataLayout(), LLVMABI* - I don't think it'd be good to fully duplicate it for Orcv2.
Comment 9 Lang Hames 2020-02-17 10:52:36 PST
Hi Andres,

Thanks very much for the feedback! (And for your patience with this reply.)

(In reply to Andres Freund from comment #7)
> There's already llvm-c/TargetMachine.h
> 
> typedef enum {
>     LLVMRelocDefault,
>     LLVMRelocStatic,
>     LLVMRelocPIC,
>     LLVMRelocDynamicNoPic,
>     LLVMRelocROPI,
>     LLVMRelocRWPI,
>     LLVMRelocROPI_RWPI
> } LLVMRelocMode;
> 
> typedef enum {
>     LLVMCodeModelDefault,
>     LLVMCodeModelJITDefault,
>     LLVMCodeModelTiny,
>     LLVMCodeModelSmall,
>     LLVMCodeModelKernel,
>     LLVMCodeModelMedium,
>     LLVMCodeModelLarge
> } LLVMCodeModel;
> 
> probably nice not to duplicate them. But perhaps that's what you were
> intending anyway?

You're right -- the API should just reuse these.

> It's a bit sad to have to duplicate most of TargetMachine.h features for
> builders. Would it make sense to instead use a TargetMachine passed in at
> creation time for the non-concurrent case, and clone when more are needed?

Yep. There's no cloning API for TargetMachines, but we could take a single TargetMachine as a prototype then use the CPU name, features, etc. from that to construct a builder.
 
> >  LLVMErrorRef addObjectToLLJIT(LLJITRef J, LLVMJITDylibRef JD, LLVMMemoryBufferRef B);
> 
> Probably also need a way to actually get the object file? Or are you
> envisioning that that'd only be useful in cases where one generates code
> outside of Orc?

I'm not sure I follow? This API is just for adding object files directly to the JIT (in which case the user already has the object file being added).

If you mean an API for caching compiled objects: Yes, that would definitely be a worthwhile thing to add.
 
> Do you foresee having a replacement for OrcV1's LLVMOrcSymbolResolverFn
> SymbolResolver? I'm currently using that to resolve function references that
> don't "properly" exist in the host binary, because they're in shared
> libraries that aren't globally visible (and which might have duplicate
> symbols between them, no guaranteed load order, etc).  Or would I have to
> eagerly provide them with addAbsoluteSymbol() or such?

That would require an API for adding JITDylib::DefinitionGenerator instances to JITDylibs, and an API for constructing a DynamicLibrarySearchGenerator. That sounds great to me. How about:

LLVMDefinitionGeneratorRef createDylibSearchGenerator(void *DylibHandle);
void addDefinitionGenerator(LLVMJITDylibRef JD, LLVMDefinitionGeneratorRef Generator);
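
The generator idea can be illustrated without any LLVM types: a symbol table consults a client callback only when a lookup misses, and records whatever the callback produces. This is a sketch of the general pattern, not of ORC's implementation; SymbolTable, DefinitionGeneratorFn, symbolTableDefine, and symbolTableLookup are hypothetical names, and the fixed-size table and caching policy are assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generator callback: returns nonzero and fills *Addr if it can
   supply a definition for Name, zero otherwise. */
typedef int (*DefinitionGeneratorFn)(void *Ctx, const char *Name,
                                     uint64_t *Addr);

#define MAX_SYMBOLS 8

/* Hypothetical stand-in for a JITDylib's symbol table. Names are stored by
   pointer, so callers must keep the strings alive (the demo uses literals). */
typedef struct SymbolTable {
  const char *Names[MAX_SYMBOLS];
  uint64_t Addrs[MAX_SYMBOLS];
  size_t NumSymbols;
  DefinitionGeneratorFn Generator; /* consulted on a lookup miss; may be NULL */
  void *GeneratorCtx;
} SymbolTable;

/* Add a definition. Returns 0 on success, -1 if the table is full. */
int symbolTableDefine(SymbolTable *T, const char *Name, uint64_t Addr) {
  if (T->NumSymbols == MAX_SYMBOLS)
    return -1;
  T->Names[T->NumSymbols] = Name;
  T->Addrs[T->NumSymbols] = Addr;
  ++T->NumSymbols;
  return 0;
}

/* Look up Name. Existing definitions win; on a miss the generator is asked,
   and anything it provides is recorded so later lookups skip it. */
int symbolTableLookup(SymbolTable *T, const char *Name, uint64_t *Addr) {
  for (size_t I = 0; I != T->NumSymbols; ++I) {
    if (strcmp(T->Names[I], Name) == 0) {
      *Addr = T->Addrs[I];
      return 0;
    }
  }
  if (T->Generator && T->Generator(T->GeneratorCtx, Name, Addr))
    return symbolTableDefine(T, Name, *Addr);
  return -1;
}

/* Demo generator: knows one symbol and counts how often it is consulted. */
static int demoGenerator(void *Ctx, const char *Name, uint64_t *Addr) {
  unsigned *Calls = Ctx;
  ++*Calls;
  if (strcmp(Name, "host_fn") == 0) {
    *Addr = 0x1000;
    return 1;
  }
  return 0;
}

/* Returns 0 if lookups behave as described above. */
int generatorDemo(void) {
  unsigned Calls = 0;
  SymbolTable T = {0};
  uint64_t Addr = 0;
  T.Generator = demoGenerator;
  T.GeneratorCtx = &Calls;
  if (symbolTableDefine(&T, "jit_fn", 0x2000) != 0)
    return 1;
  if (symbolTableLookup(&T, "jit_fn", &Addr) != 0 || Addr != 0x2000 || Calls != 0)
    return 2; /* defined symbol: generator not consulted */
  if (symbolTableLookup(&T, "host_fn", &Addr) != 0 || Addr != 0x1000 || Calls != 1)
    return 3; /* miss: generator supplies the address */
  if (symbolTableLookup(&T, "host_fn", &Addr) != 0 || Calls != 1)
    return 4; /* second lookup hits the recorded definition */
  if (symbolTableLookup(&T, "missing", &Addr) == 0 || Calls != 2)
    return 5; /* generator declines: lookup fails */
  return 0;
}
```

A callback of this shape would let a client resolve symbols from libraries that are not globally visible lazily, instead of registering every address eagerly.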
 

> I think LLJIT::lookup() needs to be exported somehow too?

Oops. That's a rather serious omission. :)

A basic implementation should probably look like:

LLVMErrorRef lookup(uint64_t *Result, LLJITRef J, LLVMJITDylibRef JD,
                    const char *SymbolName);

-- Lang.
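
The lookup signature above follows the convention used throughout the sketch: the return value reports failure, and the actual result comes back through an out-parameter. A minimal, self-contained illustration of how a client would consume such a function is given below. DemoErrorRef, demoErrorDispose, and demoLookup are hypothetical stand-ins rather than LLVM APIs, and the sketch assumes that a null error means success and that a non-null error is owned by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error object. A null DemoErrorRef means success; a non-null
   one is owned by the caller and must be disposed. */
typedef struct DemoError {
  char Message[64];
} DemoError;

typedef DemoError *DemoErrorRef;

static DemoErrorRef demoErrorCreate(const char *Msg) {
  DemoErrorRef E = calloc(1, sizeof(*E));
  if (!E)
    abort(); /* cannot report failure without an error object */
  strncpy(E->Message, Msg, sizeof(E->Message) - 1);
  return E;
}

void demoErrorDispose(DemoErrorRef E) { free(E); }

/* Hypothetical lookup: writes *Result only on success. A single hard-coded
   symbol stands in for a real symbol table. */
DemoErrorRef demoLookup(uint64_t *Result, const char *SymbolName) {
  if (strcmp(SymbolName, "main") == 0) {
    *Result = 0x4000;
    return NULL;
  }
  return demoErrorCreate("symbol not found");
}

/* Returns 0 if success and failure are both reported as described. */
int lookupDemo(void) {
  uint64_t Addr = 0;

  DemoErrorRef Err = demoLookup(&Addr, "main");
  if (Err) {
    demoErrorDispose(Err);
    return 1;
  }
  if (Addr != 0x4000)
    return 2;

  Err = demoLookup(&Addr, "nope");
  if (!Err)
    return 3;
  /* On failure the out-parameter is left untouched. */
  int OK = strcmp(Err->Message, "symbol not found") == 0 && Addr == 0x4000;
  demoErrorDispose(Err);
  return OK ? 0 : 4;
}
```

Keeping the error in the return slot means every call site can be checked the same way, at the cost of one extra pointer argument per result.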
Comment 10 Lang Hames 2020-03-14 14:58:03 PDT
Ok, to get the ball rolling I've committed some bare-bones OrcV2 C bindings and an example use case in 633ea07200ea055320dcd0ecad32639bd95aac59.

Is anyone interested in picking this up and running with it? I'm happy to review patches and answer questions, but I'd prefer not to drive this: I'm not a C API user/designer so my design sensibilities and priorities may be wildly off base, and I won't have much time to work on this so progress would be slow.
Comment 11 Lang Hames 2020-04-09 17:04:12 PDT
Basic support for LLJITBuilder and reflecting process symbols has been added in 1cd8493e69ba37f68e6e9a03b8c6b24cb5f15fa4.

Support for adding object file buffers has been added in 0d5f15f7000928273aea305d6cff7ac7c1aa352f.
Comment 12 Lang Hames 2020-04-10 15:54:20 PDT
Support for building a JITTargetMachineBuilder from a TargetMachine template (LLVMOrcJITTargetMachineBuilderCreateFromTargetMachine) has been added in 59ed45b4835.