LLVM 20.0.0git
|
An interface to create LLVM-IR for OpenMP directives. More...
#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"
Classes | |
struct | AtomicOpValue |
a struct to pack relevant information while generating atomic Ops More... | |
struct | CopyOptionsTy |
struct | DependData |
A struct to pack the relevant information for an OpenMP depend clause. More... | |
struct | FinalizationInfo |
struct | LocationDescription |
Description of a LLVM-IR insertion point (IP) and a debug/source location (filename, line, column, ...). More... | |
struct | MapInfosTy |
This structure contains combined information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers, and non-contiguous information. More... | |
struct | MapperAllocas |
struct | OutlineInfo |
Helper that contains information about regions we need to outline during finalization. More... | |
struct | ReductionInfo |
Information about an OpenMP reduction. More... | |
class | TargetDataInfo |
Struct that keeps the information that should be kept throughout a 'target data' region. More... | |
struct | TargetDataRTArgs |
Container for the arguments used to pass data to the runtime library. More... | |
struct | TargetKernelArgs |
Data structure that contains the needed information to construct the kernel args vector. More... | |
Public Types | |
enum class | ReductionGenCBKind { Clang , MLIR } |
Enum class for the RedctionGen CallBack type to be used. More... | |
enum class | EvalKind { Scalar , Complex , Aggregate } |
Enum class for reduction evaluation types scalar, complex and aggregate. More... | |
enum class | CopyAction : unsigned { RemoteLaneToThread , ThreadCopy } |
enum class | DeviceInfoTy { None , Pointer , Address } |
enum | EmitMetadataErrorKind { EMIT_MD_TARGET_REGION_ERROR , EMIT_MD_DECLARE_TARGET_ERROR , EMIT_MD_GLOBAL_VAR_LINK_ERROR } |
The kind of errors that can occur when emitting the offload entries and metadata. More... | |
enum | BodyGenTy { Priv , DupNoPriv , NoPriv } |
Type of BodyGen to use for region codegen. More... | |
using | InsertPointTy = IRBuilder<>::InsertPoint |
Type used throughout for insertion points. | |
using | FinalizeCallbackTy = std::function< void(InsertPointTy CodeGenIP)> |
Callback type for variable finalization (think destructors). | |
using | BodyGenCallbackTy = function_ref< void(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Callback type for body (=inner region) code generation. | |
using | StorableBodyGenCallbackTy = std::function< void(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
using | LoopBodyGenCallbackTy = function_ref< void(InsertPointTy CodeGenIP, Value *IndVar)> |
Callback type for loop body code generation. | |
using | PrivatizeCallbackTy = function_ref< InsertPointTy(InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value &Original, Value &Inner, Value *&ReplVal)> |
Callback type for variable privatization (think copy & default constructor). | |
using | FileIdentifierInfoCallbackTy = std::function< std::tuple< std::string, uint64_t >()> |
using | ReductionGenClangCBTy = std::function< InsertPointTy(InsertPointTy CodeGenIP, unsigned Index, Value **LHS, Value **RHS, Function *CurFn)> |
ReductionGen CallBack for Clang. | |
using | ReductionGenCBTy = std::function< InsertPointTy(InsertPointTy CodeGenIP, Value *LHS, Value *RHS, Value *&Res)> |
ReductionGen CallBack for MLIR. | |
using | ReductionGenAtomicCBTy = std::function< InsertPointTy(InsertPointTy, Type *, Value *, Value *)> |
Functions used to generate atomic reductions. | |
using | MapValuesArrayTy = SmallVector< Value *, 4 > |
using | MapDeviceInfoArrayTy = SmallVector< DeviceInfoTy, 4 > |
using | MapFlagsArrayTy = SmallVector< omp::OpenMPOffloadMappingFlags, 4 > |
using | MapNamesArrayTy = SmallVector< Constant *, 4 > |
using | MapDimArrayTy = SmallVector< uint64_t, 4 > |
using | MapNonContiguousArrayTy = SmallVector< MapValuesArrayTy, 4 > |
using | EmitFallbackCallbackTy = function_ref< InsertPointTy(InsertPointTy)> |
Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails. | |
using | EmitMetadataErrorReportFunctionTy = std::function< void(EmitMetadataErrorKind, TargetRegionEntryInfo)> |
Callback function type. | |
using | FunctionGenCallback = std::function< Function *(StringRef FunctionName)> |
Functions used to generate a function with the given name. | |
using | GenMapInfoCallbackTy = function_ref< MapInfosTy &(InsertPointTy CodeGenIP)> |
Callback type for creating the map infos for the kernel parameters. | |
using | TargetBodyGenCallbackTy = function_ref< InsertPointTy(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
using | TargetGenArgAccessorsCallbackTy = function_ref< InsertPointTy(Argument &Arg, Value *Input, Value *&RetVal, InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Public Member Functions | |
OpenMPIRBuilder (Module &M) | |
Create a new OpenMPIRBuilder operating on the given module M . | |
~OpenMPIRBuilder () | |
void | initialize () |
Initialize the internal state, this will put structures types and potentially other helpers into the underlying module. | |
void | setConfig (OpenMPIRBuilderConfig C) |
void | finalize (Function *Fn=nullptr) |
Finalize the underlying module, e.g., by outlining regions. | |
void | addAttributes (omp::RuntimeFunction FnID, Function &Fn) |
Add attributes known for FnID to Fn . | |
std::string | createPlatformSpecificName (ArrayRef< StringRef > Parts) const |
Get the create a name using the platform specific separators. | |
void | pushFinalizationCB (const FinalizationInfo &FI) |
Push a finalization callback on the finalization stack. | |
void | popFinalizationCB () |
Pop the last finalization callback from the finalization stack. | |
InsertPointTy | createBarrier (const LocationDescription &Loc, omp::Directive Kind, bool ForceSimpleCall=false, bool CheckCancelFlag=true) |
Emitter methods for OpenMP directives. | |
InsertPointTy | createCancel (const LocationDescription &Loc, Value *IfCondition, omp::Directive CanceledDirective) |
Generator for '#omp cancel'. | |
IRBuilder ::InsertPoint | createParallel (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, Value *IfCondition, Value *NumThreads, omp::ProcBindKind ProcBind, bool IsCancellable) |
Generator for '#omp parallel'. | |
CanonicalLoopInfo * | createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *TripCount, const Twine &Name="loop") |
Generator for the control flow structure of an OpenMP canonical loop. | |
CanonicalLoopInfo * | createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop, InsertPointTy ComputeIP={}, const Twine &Name="loop") |
Generator for the control flow structure of an OpenMP canonical loop. | |
CanonicalLoopInfo * | collapseLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, InsertPointTy ComputeIP) |
Collapse a loop nest into a single loop. | |
Constant * | getAddrOfDeclareTargetVar (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, Type *LlvmPtrTy, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage) |
Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables. | |
void | registerTargetGlobalVariable (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage, Type *LlvmPtrTy, Constant *Addr) |
Registers a target variable for device or host. | |
unsigned | getFlagMemberOffset () |
Get the offset of the OMP_MAP_MEMBER_OF field. | |
omp::OpenMPOffloadMappingFlags | getMemberOfFlag (unsigned Position) |
Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given. | |
void | setCorrectMemberOfFlag (omp::OpenMPOffloadMappingFlags &Flags, omp::OpenMPOffloadMappingFlags MemberOfFlag) |
Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function. | |
InsertPointTy | applyWorkshareLoop (DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP, bool NeedsBarrier, llvm::omp::ScheduleKind SchedKind=llvm::omp::OMP_SCHEDULE_Default, Value *ChunkSize=nullptr, bool HasSimdModifier=false, bool HasMonotonicModifier=false, bool HasNonmonotonicModifier=false, bool HasOrderedClause=false, omp::WorksharingLoopType LoopType=omp::WorksharingLoopType::ForStaticLoop) |
Modifies the canonical loop to be a workshare loop. | |
std::vector< CanonicalLoopInfo * > | tileLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, ArrayRef< Value * > TileSizes) |
Tile a loop nest. | |
void | unrollLoopFull (DebugLoc DL, CanonicalLoopInfo *Loop) |
Fully unroll a loop. | |
void | unrollLoopHeuristic (DebugLoc DL, CanonicalLoopInfo *Loop) |
Fully or partially unroll a loop. | |
void | unrollLoopPartial (DebugLoc DL, CanonicalLoopInfo *Loop, int32_t Factor, CanonicalLoopInfo **UnrolledCLI) |
Partially unroll a loop. | |
void | applySimd (CanonicalLoopInfo *Loop, MapVector< Value *, Value * > AlignedVars, Value *IfCond, omp::OrderKind Order, ConstantInt *Simdlen, ConstantInt *Safelen) |
Add metadata to simd-ize a loop. | |
void | createFlush (const LocationDescription &Loc) |
Generator for '#omp flush'. | |
void | createTaskwait (const LocationDescription &Loc) |
Generator for '#omp taskwait'. | |
void | createTaskyield (const LocationDescription &Loc) |
Generator for '#omp taskyield'. | |
InsertPointTy | createTask (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, bool Tied=true, Value *Final=nullptr, Value *IfCondition=nullptr, SmallVector< DependData > Dependencies={}) |
Generator for #omp task | |
InsertPointTy | createTaskgroup (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB) |
Generator for the taskgroup construct. | |
InsertPointTy | createReductionsGPU (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, ArrayRef< ReductionInfo > ReductionInfos, bool IsNoWait=false, bool IsTeamsReduction=false, bool HasDistribute=false, ReductionGenCBKind ReductionGenCBKind=ReductionGenCBKind::MLIR, std::optional< omp::GV > GridValue={}, unsigned ReductionBufNum=1024, Value *SrcLocInfo=nullptr) |
Design of OpenMP reductions on the GPU. | |
InsertPointTy | createReductions (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< ReductionInfo > ReductionInfos, ArrayRef< bool > IsByRef, bool IsNoWait=false) |
Generator for '#omp reduction'. | |
InsertPointTy | getInsertionPoint () |
} | |
bool | updateToLocation (const LocationDescription &Loc) |
Update the internal location to Loc . | |
FunctionCallee | getOrCreateRuntimeFunction (Module &M, omp::RuntimeFunction FnID) |
Return the function declaration for the runtime function with FnID . | |
Function * | getOrCreateRuntimeFunctionPtr (omp::RuntimeFunction FnID) |
Constant * | getOrCreateSrcLocStr (StringRef LocStr, uint32_t &SrcLocStrSize) |
Return the (LLVM-IR) string describing the source location LocStr . | |
Constant * | getOrCreateDefaultSrcLocStr (uint32_t &SrcLocStrSize) |
Return the (LLVM-IR) string describing the default source location. | |
Constant * | getOrCreateSrcLocStr (StringRef FunctionName, StringRef FileName, unsigned Line, unsigned Column, uint32_t &SrcLocStrSize) |
Return the (LLVM-IR) string describing the source location identified by the arguments. | |
Constant * | getOrCreateSrcLocStr (DebugLoc DL, uint32_t &SrcLocStrSize, Function *F=nullptr) |
Return the (LLVM-IR) string describing the DebugLoc DL . | |
Constant * | getOrCreateSrcLocStr (const LocationDescription &Loc, uint32_t &SrcLocStrSize) |
Return the (LLVM-IR) string describing the source location Loc . | |
Constant * | getOrCreateIdent (Constant *SrcLocStr, uint32_t SrcLocStrSize, omp::IdentFlag Flags=omp::IdentFlag(0), unsigned Reserve2Flags=0) |
Return an ident_t* encoding the source location SrcLocStr and Flags . | |
GlobalValue * | createGlobalFlag (unsigned Value, StringRef Name) |
Create a hidden global flag Name in the module with initial value Value . | |
void | emitCancelationCheckImpl (Value *CancelFlag, omp::Directive CanceledDirective, FinalizeCallbackTy ExitCB={}) |
Generate control flow and cleanup for cancellation. | |
InsertPointTy | emitTargetKernel (const LocationDescription &Loc, InsertPointTy AllocaIP, Value *&Return, Value *Ident, Value *DeviceID, Value *NumTeams, Value *NumThreads, Value *HostPtr, ArrayRef< Value * > KernelArgs) |
Generate a target region entry call. | |
void | emitFlush (const LocationDescription &Loc) |
Generate a flush runtime call. | |
bool | isLastFinalizationInfoCancellable (omp::Directive DK) |
Return true if the last entry in the finalization stack is of kind DK and cancellable. | |
void | emitTaskwaitImpl (const LocationDescription &Loc) |
Generate a taskwait runtime call. | |
void | emitTaskyieldImpl (const LocationDescription &Loc) |
Generate a taskyield runtime call. | |
Value * | getOrCreateThreadID (Value *Ident) |
Return the current thread ID. | |
void | addOutlineInfo (OutlineInfo &&OI) |
Add a new region that will be outlined later. | |
Value * | getSizeInBytes (Value *BasePtr) |
Computes the size of type in bytes. | |
void | emitBranch (BasicBlock *Target) |
void | emitBlock (BasicBlock *BB, Function *CurFn, bool IsFinished=false) |
void | emitIfClause (Value *Cond, BodyGenCallbackTy ThenGen, BodyGenCallbackTy ElseGen, InsertPointTy AllocaIP={}) |
Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }. | |
GlobalVariable * | createOffloadMaptypes (SmallVectorImpl< uint64_t > &Mappings, std::string VarName) |
Create the global variable holding the offload mappings information. | |
GlobalVariable * | createOffloadMapnames (SmallVectorImpl< llvm::Constant * > &Names, std::string VarName) |
Create the global variable holding the offload names information. | |
void | createMapperAllocas (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumOperands, struct MapperAllocas &MapperAllocas) |
Create the allocas instruction used in call to mapper functions. | |
void | emitMapperCall (const LocationDescription &Loc, Function *MapperFunc, Value *SrcLocInfo, Value *MaptypesArg, Value *MapnamesArg, struct MapperAllocas &MapperAllocas, int64_t DeviceID, unsigned NumOperands) |
Create the call for the target mapper function. | |
InsertPointTy | emitKernelLaunch (const LocationDescription &Loc, Function *OutlinedFn, Value *OutlinedFnID, EmitFallbackCallbackTy EmitTargetCallFallbackCB, TargetKernelArgs &Args, Value *DeviceID, Value *RTLoc, InsertPointTy AllocaIP) |
Generate a target region entry call and host fallback call. | |
InsertPointTy | emitTargetTask (Function *OutlinedFn, Value *OutlinedFnID, EmitFallbackCallbackTy EmitTargetCallFallbackCB, TargetKernelArgs &Args, Value *DeviceID, Value *RTLoc, InsertPointTy AllocaIP, SmallVector< OpenMPIRBuilder::DependData > &Dependencies, bool HasNoWait) |
Generate a target-task for the target construct. | |
void | emitOffloadingArraysArgument (IRBuilderBase &Builder, OpenMPIRBuilder::TargetDataRTArgs &RTArgs, OpenMPIRBuilder::TargetDataInfo &Info, bool ForEndCall=false) |
Emit the arguments to be passed to the runtime library based on the arrays of base pointers, pointers, sizes, map types, and mappers. | |
void | emitNonContiguousDescriptor (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info) |
Emit an array of struct descriptors to be assigned to the offload args. | |
void | emitOffloadingArrays (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info, bool IsNonContiguous=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr) |
Emit the arrays used to pass the captures and map information to the offloading runtime library. | |
void | emitOffloadingArraysAndArgs (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, TargetDataInfo &Info, TargetDataRTArgs &RTArgs, MapInfosTy &CombinedInfo, bool IsNonContiguous=false, bool ForEndCall=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr) |
Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}). | |
void | createOffloadEntry (Constant *ID, Constant *Addr, uint64_t Size, int32_t Flags, GlobalValue::LinkageTypes, StringRef Name="") |
Creates offloading entry for the provided entry ID ID, address Addr, size Size, and flags Flags. | |
void | createOffloadEntriesAndInfoMetadata (EmitMetadataErrorReportFunctionTy &ErrorReportFunction) |
InsertPointTy | createCopyPrivate (const LocationDescription &Loc, llvm::Value *BufSize, llvm::Value *CpyBuf, llvm::Value *CpyFn, llvm::Value *DidIt) |
Generator for __kmpc_copyprivate. | |
InsertPointTy | createSingle (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsNowait, ArrayRef< llvm::Value * > CPVars={}, ArrayRef< llvm::Function * > CPFuncs={}) |
Generator for '#omp single'. | |
InsertPointTy | createMaster (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB) |
Generator for '#omp master'. | |
InsertPointTy | createMasked (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, Value *Filter) |
Generator for '#omp masked'. | |
InsertPointTy | createCritical (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, StringRef CriticalName, Value *HintInst) |
Generator for '#omp critical'. | |
InsertPointTy | createOrderedDepend (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumLoops, ArrayRef< llvm::Value * > StoreValues, const Twine &Name, bool IsDependSource) |
Generator for '#omp ordered depend (source | sink)'. | |
InsertPointTy | createOrderedThreadsSimd (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsThreads) |
Generator for '#omp ordered [threads | simd]'. | |
InsertPointTy | createSections (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< StorableBodyGenCallbackTy > SectionCBs, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, bool IsCancellable, bool IsNowait) |
Generator for '#omp sections'. | |
InsertPointTy | createSection (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB) |
Generator for '#omp section'. | |
InsertPointTy | createTeams (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, Value *NumTeamsLower=nullptr, Value *NumTeamsUpper=nullptr, Value *ThreadLimit=nullptr, Value *IfExpr=nullptr) |
Generator for #omp teams | |
InsertPointTy | createCopyinClauseBlocks (InsertPointTy IP, Value *MasterAddr, Value *PrivateAddr, llvm::IntegerType *IntPtrTy, bool BranchtoEnd=true) |
Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies. | |
CallInst * | createOMPAlloc (const LocationDescription &Loc, Value *Size, Value *Allocator, std::string Name="") |
Create a runtime call for kmpc_Alloc. | |
CallInst * | createOMPFree (const LocationDescription &Loc, Value *Addr, Value *Allocator, std::string Name="") |
Create a runtime call for kmpc_free. | |
CallInst * | createCachedThreadPrivate (const LocationDescription &Loc, llvm::Value *Pointer, llvm::ConstantInt *Size, const llvm::Twine &Name=Twine("")) |
Create a runtime call for kmpc_threadprivate_cached. | |
CallInst * | createOMPInteropInit (const LocationDescription &Loc, Value *InteropVar, omp::OMPInteropType InteropType, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause) |
Create a runtime call for __tgt_interop_init. | |
CallInst * | createOMPInteropDestroy (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause) |
Create a runtime call for __tgt_interop_destroy. | |
CallInst * | createOMPInteropUse (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause) |
Create a runtime call for __tgt_interop_use. | |
InsertPointTy | createTargetInit (const LocationDescription &Loc, bool IsSPMD, int32_t MinThreadsVal=0, int32_t MaxThreadsVal=0, int32_t MinTeamsVal=0, int32_t MaxTeamsVal=0) |
The omp target interface. | |
void | createTargetDeinit (const LocationDescription &Loc, int32_t TeamsReductionDataSize=0, int32_t TeamsReductionBufferLength=1024) |
Create a runtime call for kmpc_target_deinit. | |
void | emitTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, FunctionGenCallback &GenerateFunctionCallback, bool IsOffloadEntry, Function *&OutlinedFn, Constant *&OutlinedFnID) |
Create a unique name for the entry function using the source location information of the current target region. | |
Constant * | registerTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, Function *OutlinedFunction, StringRef EntryFnName, StringRef EntryFnIDName) |
Registers the given function and sets up the attribtues of the function Returns the FunctionID. | |
OpenMPIRBuilder::InsertPointTy | createTargetData (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value *DeviceID, Value *IfCond, TargetDataInfo &Info, GenMapInfoCallbackTy GenMapInfoCB, omp::RuntimeFunction *MapperFunc=nullptr, function_ref< InsertPointTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)> BodyGenCB=nullptr, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr, Value *SrcLocInfo=nullptr) |
Generator for '#omp target data'. | |
InsertPointTy | createTarget (const LocationDescription &Loc, bool IsOffloadEntry, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo, ArrayRef< int32_t > NumTeams, ArrayRef< int32_t > NumThreads, SmallVectorImpl< Value * > &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, SmallVector< DependData > Dependencies={}) |
Generator for '#omp target'. | |
FunctionCallee | createForStaticInitFunction (unsigned IVSize, bool IVSigned, bool IsGPUDistribute) |
Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned. | |
FunctionCallee | createDispatchInitFunction (unsigned IVSize, bool IVSigned) |
Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned. | |
FunctionCallee | createDispatchNextFunction (unsigned IVSize, bool IVSigned) |
Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned. | |
FunctionCallee | createDispatchFiniFunction (unsigned IVSize, bool IVSigned) |
Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned. | |
FunctionCallee | createDispatchDeinitFunction () |
Returns __kmpc_dispatch_deinit runtime function. | |
InsertPointTy | createAtomicRead (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOrdering AO) |
Emit atomic Read for : V = X — Only Scalar data types. | |
InsertPointTy | createAtomicWrite (const LocationDescription &Loc, AtomicOpValue &X, Value *Expr, AtomicOrdering AO) |
Emit atomic write for : X = Expr — Only Scalar data types. | |
InsertPointTy | createAtomicUpdate (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool IsXBinopExpr) |
Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types. | |
InsertPointTy | createAtomicCapture (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, AtomicOpValue &V, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool UpdateExpr, bool IsPostfixUpdate, bool IsXBinopExpr) |
Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,. | |
InsertPointTy | createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly) |
Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ? expr : x; x = expr ordop x ? expr : x; x = x == e ? d : x; x = e == x ? d : x; (this one is not in the spec) cond-update-stmt: if (x ordop expr) { x = expr; } if (expr ordop x) { x = expr; } if (x == e) { x = d; } if (e == x) { x = d; } (this one is not in the spec) conditional-update-capture-atomic: v = x; cond-update-stmt; (IsPostfixUpdate=true, IsFailOnly=false) cond-update-stmt; v = x; (IsPostfixUpdate=false, IsFailOnly=false) if (x == e) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) r = x == e; if (r) { x = d; } (IsPostfixUpdate=false, IsFailOnly=false) r = x == e; if (r) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) | |
InsertPointTy | createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly, AtomicOrdering Failure) |
CanonicalLoopInfo * | createLoopSkeleton (DebugLoc DL, Value *TripCount, Function *F, BasicBlock *PreInsertBefore, BasicBlock *PostInsertBefore, const Twine &Name={}) |
Create the control flow structure of a canonical OpenMP loop. | |
void | loadOffloadInfoMetadata (Module &M) |
Loads all the offload entries information from the host IR metadata. | |
void | loadOffloadInfoMetadata (StringRef HostFilePath) |
Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument. | |
GlobalVariable * | getOrCreateInternalVariable (Type *Ty, const StringRef &Name, unsigned AddressSpace=0) |
Gets (if variable with the given name already exist) or creates internal global variable with the specified Name. | |
Static Public Member Functions | |
static unsigned | getOpenMPDefaultSimdAlign (const Triple &TargetTriple, const StringMap< bool > &Features) |
Get the default alignment value for given target. | |
static TargetRegionEntryInfo | getTargetEntryUniqueInfo (FileIdentifierInfoCallbackTy CallBack, StringRef ParentName="") |
Creates a unique info for a target entry when provided a filename and line number from. | |
static void | getKernelArgsVector (TargetKernelArgs &KernelArgs, IRBuilderBase &Builder, SmallVector< Value * > &ArgsVector) |
Create the kernel args vector used by emitTargetKernel. | |
static std::pair< int32_t, int32_t > | readThreadBoundsForKernel (const Triple &T, Function &Kernel) |
} | |
static void | writeThreadBoundsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB) |
static std::pair< int32_t, int32_t > | readTeamBoundsForKernel (const Triple &T, Function &Kernel) |
Read/write a bounds on teams for Kernel . | |
static void | writeTeamsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB) |
Public Attributes | |
SmallVector< FinalizationInfo, 8 > | FinalizationStack |
The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive. | |
OpenMPIRBuilderConfig | Config |
The OpenMPIRBuilder Configuration. | |
Module & | M |
The underlying LLVM-IR module. | |
IRBuilder | Builder |
The LLVM-IR Builder used to create IR. | |
StringMap< Constant * > | SrcLocStrMap |
Map to remember source location strings. | |
DenseMap< std::pair< Constant *, uint64_t >, Constant * > | IdentMap |
Map to remember existing ident_t*. | |
OffloadEntriesInfoManager | OffloadInfoManager |
Info manager to keep track of target regions. | |
const Triple | T |
The target triple of the underlying module. | |
SmallVector< OutlineInfo, 16 > | OutlineInfos |
Collection of regions that need to be outlined during finalization. | |
SmallVector< llvm::Function *, 16 > | ConstantAllocaRaiseCandidates |
A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed. | |
std::forward_list< CanonicalLoopInfo > | LoopInfos |
Collection of owned canonical loop objects that eventually need to be free'd. | |
StringMap< GlobalVariable *, BumpPtrAllocator > | InternalVars |
An ordered map of auto-generated variables to their unique names. | |
const std::string | ompOffloadInfoName = "omp_offload.info" |
OMP Offload Info Metadata name string. | |
An interface to create LLVM-IR for OpenMP directives.
Each OpenMP directive has a corresponding public generator method.
Definition at line 473 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::BodyGenCallbackTy = function_ref<void(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Callback type for body (=inner region) code generation.
The callback takes code locations as arguments, each describing a location where additional instructions can be inserted.
The CodeGenIP may be in the middle of a basic block or point to the end of it. The basic block may have a terminator or be degenerate. The callback function may just insert instructions at that position, but also split the block (without the Before argument of BasicBlock::splitBasicBlock such that the identify of the split predecessor block is preserved) and insert additional control flow, including branches that do not lead back to what follows the CodeGenIP. Note that since the callback is allowed to split the block, callers must assume that InsertPoints to positions in the BasicBlock after CodeGenIP including CodeGenIP itself are invalidated. If such InsertPoints need to be preserved, it can split the block itself before calling the callback.
AllocaIP and CodeGenIP must not point to the same position.
AllocaIP | is the insertion point at which new alloca instructions should be placed. The BasicBlock it is pointing to must not be split. |
CodeGenIP | is the insertion point at which the body code should be placed. |
Definition at line 570 of file OMPIRBuilder.h.
Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails.
It takes an insertion point as parameter where the code should be emitted. It returns an insertion point that points right after after the emitted code.
Definition at line 2318 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::EmitMetadataErrorReportFunctionTy = std::function<void(EmitMetadataErrorKind, TargetRegionEntryInfo)> |
Callback function type.
Definition at line 2410 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::FileIdentifierInfoCallbackTy = std::function<std::tuple<std::string, uint64_t>()> |
Definition at line 1247 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::FinalizeCallbackTy = std::function<void(InsertPointTy CodeGenIP)> |
Callback type for variable finalization (think destructors).
CodeGenIP | is the insertion point at which the finalization code should be placed. |
A finalize callback knows about all objects that need finalization, e.g. destruction, when the scope of the currently generated construct is left at the time, and location, the callback is invoked.
Definition at line 519 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::FunctionGenCallback = std::function<Function *(StringRef FunctionName)> |
Functions used to generate a function with the given name.
Definition at line 2734 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::GenMapInfoCallbackTy = function_ref<MapInfosTy &(InsertPointTy CodeGenIP)> |
Callback type for creating the map infos for the kernel parameters.
CodeGenIP | is the insertion point where code should be generated, if any. |
Definition at line 2801 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::InsertPointTy = IRBuilder<>::InsertPoint |
Type used throughout for insertion points.
Definition at line 499 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::LoopBodyGenCallbackTy = function_ref<void(InsertPointTy CodeGenIP, Value *IndVar)> |
Callback type for loop body code generation.
CodeGenIP | is the insertion point where the loop's body code must be placed. This will be a dedicated BasicBlock with a conditional branch from the loop condition check and terminated with an unconditional branch to the loop latch. |
IndVar | is the induction variable usable at the insertion point. |
Definition at line 588 of file OMPIRBuilder.h.
Definition at line 2268 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::MapDimArrayTy = SmallVector<uint64_t, 4> |
Definition at line 2271 of file OMPIRBuilder.h.
Definition at line 2269 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::MapNamesArrayTy = SmallVector<Constant *, 4> |
Definition at line 2270 of file OMPIRBuilder.h.
Definition at line 2272 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::MapValuesArrayTy = SmallVector<Value *, 4> |
Definition at line 2267 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::PrivatizeCallbackTy = function_ref<InsertPointTy( InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value &Original, Value &Inner, Value *&ReplVal)> |
Callback type for variable privatization (think copy & default constructor).
AllocaIP | is the insertion point at which new alloca instructions should be placed. |
CodeGenIP | is the insertion point at which the privatization code should be placed. |
Original | The value being copied/created, should not be used in the generated IR. |
Inner | The equivalent of Original that should be used in the generated IR; this is equal to Original if the value is a pointer and can thus be passed directly, otherwise it is an equivalent but different value. |
ReplVal | The replacement value, thus a copy or new created version of Inner . |
ReplVal
the replacement value. Definition at line 609 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::ReductionGenAtomicCBTy = std::function<InsertPointTy(InsertPointTy, Type *, Value *, Value *)> |
Functions used to generate atomic reductions.
Such functions take two Values representing pointers to LHS and RHS of the reduction, as well as the element type of these pointers. They are expected to atomically update the LHS to the reduced value.
Definition at line 1290 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::ReductionGenCBTy = std::function<InsertPointTy( InsertPointTy CodeGenIP, Value *LHS, Value *RHS, Value *&Res)> |
using llvm::OpenMPIRBuilder::ReductionGenClangCBTy = std::function<InsertPointTy(InsertPointTy CodeGenIP, unsigned Index, Value **LHS, Value **RHS, Function *CurFn)> |
ReductionGen CallBack for Clang.
CodeGenIP | InsertPoint for CodeGen. |
Index | Index of the ReductionInfo to generate code for. |
LHSPtr | Optionally used by Clang to return the LHSPtr it used for codegen, used for fixup later. |
RHSPtr | Optionally used by Clang to return the RHSPtr it used for codegen, used for fixup later. |
CurFn | Optionally used by Clang to pass in the Current Function as Clang context may be old. |
Definition at line 1274 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::StorableBodyGenCallbackTy = std::function<void(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Definition at line 577 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::TargetBodyGenCallbackTy = function_ref<InsertPointTy( InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Definition at line 2833 of file OMPIRBuilder.h.
using llvm::OpenMPIRBuilder::TargetGenArgAccessorsCallbackTy = function_ref<InsertPointTy( Argument &Arg, Value *Input, Value *&RetVal, InsertPointTy AllocaIP, InsertPointTy CodeGenIP)> |
Definition at line 2836 of file OMPIRBuilder.h.
Type of BodyGen to use for region codegen.
Priv: If device pointer privatization is required, emit the body of the region here. It will have to be duplicated: with and without privatization. DupNoPriv: If we need device pointer privatization, we need to emit the body of the region with no privatization in the 'else' branch of the conditional. NoPriv: If we don't require privatization of device pointers, we emit the body in between the runtime calls. This avoids duplicating the body code.
Enumerator | |
---|---|
Priv | |
DupNoPriv | |
NoPriv |
Definition at line 2796 of file OMPIRBuilder.h.
|
strong |
Enumerator | |
---|---|
RemoteLaneToThread | |
ThreadCopy |
Definition at line 1340 of file OMPIRBuilder.h.
|
strong |
Enumerator | |
---|---|
None | |
Pointer | |
Address |
Definition at line 2266 of file OMPIRBuilder.h.
The kind of errors that can occur when emitting the offload entries and metadata.
Enumerator | |
---|---|
EMIT_MD_TARGET_REGION_ERROR | |
EMIT_MD_DECLARE_TARGET_ERROR | |
EMIT_MD_GLOBAL_VAR_LINK_ERROR |
Definition at line 2403 of file OMPIRBuilder.h.
|
strong |
Enum class for reduction evaluation types scalar, complex and aggregate.
Enumerator | |
---|---|
Scalar | |
Complex | |
Aggregate |
Definition at line 1294 of file OMPIRBuilder.h.
|
strong |
Enum class for the RedctionGen CallBack type to be used.
Enumerator | |
---|---|
Clang | |
MLIR |
Definition at line 1262 of file OMPIRBuilder.h.
|
inline |
Create a new OpenMPIRBuilder operating on the given module M
.
This will not have an effect on M
(see initialize)
Definition at line 477 of file OMPIRBuilder.h.
OpenMPIRBuilder::~OpenMPIRBuilder | ( | ) |
Definition at line 818 of file OMPIRBuilder.cpp.
References assert(), and OutlineInfos.
void OpenMPIRBuilder::addAttributes | ( | omp::RuntimeFunction | FnID, |
Function & | Fn | ||
) |
Add attributes known for FnID
to Fn
.
Definition at line 535 of file OMPIRBuilder.cpp.
References llvm::AttributeSet::addAttribute(), llvm::AttributeSet::addAttributes(), llvm::Function::arg_size(), assert(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::Function::getAttributes(), and llvm::Function::getContext().
Referenced by getOrCreateRuntimeFunction().
|
inline |
Add a new region that will be outlined later.
Definition at line 2088 of file OMPIRBuilder.h.
References OutlineInfos.
Referenced by createParallel(), createTask(), createTeams(), and emitTargetTask().
void OpenMPIRBuilder::applySimd | ( | CanonicalLoopInfo * | Loop, |
MapVector< Value *, Value * > | AlignedVars, | ||
Value * | IfCond, | ||
omp::OrderKind | Order, | ||
ConstantInt * | Simdlen, | ||
ConstantInt * | Safelen | ||
) |
Add metadata to simd-ize a loop.
If IfCond is not nullptr, the loop is cloned. The metadata which prevents vectorization is added to to the cloned loop. The cloned loop is executed when ifCond is evaluated to false.
Loop | The loop to simd-ize. |
AlignedVars | The map which containts pairs of the pointer and its corresponding alignment. |
IfCond | The value which corresponds to the if clause condition. |
Order | The enum to map order clause. |
Simdlen | The Simdlen length to apply to the simd loop. |
Safelen | The Safelen length to apply to the simd loop. |
Definition at line 5175 of file OMPIRBuilder.cpp.
References addBasicBlockMetadata(), addLoopMetadata(), addSimdMetadata(), assert(), llvm::Block, Builder, llvm::IRBuilderBase::CreateAlignmentAssumption(), F, FAM, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::CanonicalLoopInfo::getCond(), llvm::IRBuilderBase::getContext(), llvm::MDNode::getDistinct(), llvm::ConstantInt::getFalse(), llvm::CanonicalLoopInfo::getFunction(), llvm::CanonicalLoopInfo::getHeader(), llvm::Type::getInt1Ty(), llvm::CanonicalLoopInfo::getLatch(), llvm::LoopInfoBase< BlockT, LoopT >::getLoopFor(), llvm::CanonicalLoopInfo::getPreheader(), llvm::BasicBlock::getTerminator(), llvm::ConstantInt::getTrue(), llvm::SmallSet< T, N, C >::insert(), llvm::ValueMap< KeyT, ValueT, Config >::lookup(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::AnalysisManager< IRUnitT, ExtraArgTs >::registerPass(), llvm::IRBuilderBase::restoreIP(), llvm::LoopAnalysis::run(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::MapVector< KeyT, ValueT, MapType, VectorType >::size().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::applyWorkshareLoop | ( | DebugLoc | DL, |
CanonicalLoopInfo * | CLI, | ||
InsertPointTy | AllocaIP, | ||
bool | NeedsBarrier, | ||
llvm::omp::ScheduleKind | SchedKind = llvm::omp::OMP_SCHEDULE_Default , |
||
Value * | ChunkSize = nullptr , |
||
bool | HasSimdModifier = false , |
||
bool | HasMonotonicModifier = false , |
||
bool | HasNonmonotonicModifier = false , |
||
bool | HasOrderedClause = false , |
||
omp::WorksharingLoopType | LoopType = omp::WorksharingLoopType::ForStaticLoop |
||
) |
Modifies the canonical loop to be a workshare loop.
This takes a LoopInfo
representing a canonical loop, such as the one created by createCanonicalLoop
and emits additional instructions to turn it into a workshare loop. In particular, it calls to an OpenMP runtime function in the preheader to obtain the loop bounds to be used in the current thread, updates the relevant instructions in the canonical loop and calls to an OpenMP runtime finalization function after the loop.
The concrete transformation is done by applyStaticWorkshareLoop, applyStaticChunkedWorkshareLoop, or applyDynamicWorkshareLoop, depending on the value of SchedKind
and ChunkSize
.
DL | Debug location for instructions added for the workshare-loop construct itself. |
CLI | A descriptor of the canonical loop to workshare. |
AllocaIP | An insertion point for Alloca instructions usable in the preheader of the loop. |
NeedsBarrier | Indicates whether a barrier must be insterted after the loop. |
SchedKind | Scheduling algorithm to use. |
ChunkSize | The chunk size for the inner loop. |
HasSimdModifier | Whether the simd modifier is present in the schedule clause. |
HasMonotonicModifier | Whether the monotonic modifier is present in the schedule clause. |
HasNonmonotonicModifier | Whether the nonmonotonic modifier is present in the schedule clause. |
HasOrderedClause | Whether the (parameterless) ordered clause is present. |
LoopType | Information about type of loop worksharing. It corresponds to type of loop workshare OpenMP pragma. |
Definition at line 4454 of file OMPIRBuilder.cpp.
References assert(), computeOpenMPScheduleType(), Config, DL, llvm::OpenMPIRBuilderConfig::isTargetDevice(), and llvm_unreachable.
CanonicalLoopInfo * OpenMPIRBuilder::collapseLoops | ( | DebugLoc | DL, |
ArrayRef< CanonicalLoopInfo * > | Loops, | ||
InsertPointTy | ComputeIP | ||
) |
Collapse a loop nest into a single loop.
Merges loops of a loop nest into a single CanonicalLoopNest representation that has the same number of innermost loop iterations as the origin loop nest. The induction variables of the input loops are derived from the collapsed loop's induction variable. This is intended to be used to implement OpenMP's collapse clause. Before applying a directive, collapseLoops normalizes a loop nest to contain only a single loop and the directive's implementation does not need to handle multiple loops itself. This does not remove the need to handle all loop nest handling by directives, such as the ordered(<n>) clause or the simd schedule-clause modifier of the worksharing-loop directive.
Example:
After collapsing with Loops={i,j}, the loop is changed to
In the current implementation, the following limitations apply:
ComputeIP
. Non-rectangular loops are not yet supported.DL | Debug location for instructions added for collapsing, such as instructions to compute/derive the input loop's induction variables. |
Loops | Loops in the loop nest to collapse. Loops are specified from outermost-to-innermost and every control flow of a loop's body must pass through its directly nested loop. |
ComputeIP | Where additional instruction that compute the collapsed trip count. If not set, defaults to before the generated loop. |
Definition at line 4715 of file OMPIRBuilder.cpp.
References assert(), Builder, createLoopSkeleton(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateUDiv(), llvm::IRBuilderBase::CreateURem(), DL, F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getLatch(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::IRBuilderBase::InsertPoint::isSet(), Loops, redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::SmallVectorImpl< T >::reserve(), llvm::SmallVectorImpl< T >::resize(), llvm::IRBuilderBase::restoreIP(), and llvm::IRBuilderBase::SetCurrentDebugLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCapture | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
AtomicOpValue & | X, | ||
AtomicOpValue & | V, | ||
Value * | Expr, | ||
AtomicOrdering | AO, | ||
AtomicRMWInst::BinOp | RMWOp, | ||
AtomicUpdateCallbackTy & | UpdateOp, | ||
bool | UpdateExpr, | ||
bool | IsPostfixUpdate, | ||
bool | IsXBinopExpr | ||
) |
Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,.
Loc | The insert and source location description. |
AllocaIP | The insertion point to be used for alloca instructions. |
X | The target atomic pointer to be updated |
V | Memory address where to store captured value |
Expr | The value to update X with. |
AO | Atomic ordering of the generated atomic instructions |
RMWOp | The binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated. |
UpdateOp | Code generator for complex expressions that cannot be expressed through atomicrmw instruction. |
UpdateExpr | true if X is an in place update of the form X = X BinOp Expr or X = Expr BinOp X |
IsXBinopExpr | true if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr) |
IsPostfixUpdate | true if original value of 'x' must be stored in 'v', not an updated one. |
Definition at line 8004 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::IRBuilderBase::CreateStore(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::IRBuilderBase::saveIP(), updateToLocation(), X, and llvm::AtomicRMWInst::Xchg.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare | ( | const LocationDescription & | Loc, |
AtomicOpValue & | X, | ||
AtomicOpValue & | V, | ||
AtomicOpValue & | R, | ||
Value * | E, | ||
Value * | D, | ||
AtomicOrdering | AO, | ||
omp::OMPAtomicCompareOp | Op, | ||
bool | IsXBinopExpr, | ||
bool | IsPostfixUpdate, | ||
bool | IsFailOnly | ||
) |
Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ? expr : x; x = expr ordop x ? expr : x; x = x == e ? d : x; x = e == x ? d : x; (this one is not in the spec) cond-update-stmt: if (x ordop expr) { x = expr; } if (expr ordop x) { x = expr; } if (x == e) { x = d; } if (e == x) { x = d; } (this one is not in the spec) conditional-update-capture-atomic: v = x; cond-update-stmt; (IsPostfixUpdate=true, IsFailOnly=false) cond-update-stmt; v = x; (IsPostfixUpdate=false, IsFailOnly=false) if (x == e) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) r = x == e; if (r) { x = d; } (IsPostfixUpdate=false, IsFailOnly=false) r = x == e; if (r) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true)
Loc | The insert and source location description. |
X | The target atomic pointer to be updated. |
V | Memory address where to store captured value (for compare capture only). |
R | Memory address where to store comparison result (for compare capture with '==' only). |
E | The expected value ('e') for forms that use an equality comparison or an expression ('expr') for forms that use 'ordop' (logically an atomic maximum or minimum). |
D | The desired value for forms that use an equality comparison. If forms that use 'ordop', it should be nullptr . |
AO | Atomic ordering of the generated atomic instructions. |
Op | Atomic compare operation. It can only be ==, <, or >. |
IsXBinopExpr | True if the conditional statement is in the form where x is on LHS. It only matters for < or >. |
IsPostfixUpdate | True if original value of 'x' must be stored in 'v', not an updated one (for compare capture only). |
IsFailOnly | True if the original value of 'x' is stored to 'v' only when the comparison fails. This is only valid for the case the comparison is '=='. |
Definition at line 8038 of file OMPIRBuilder.cpp.
References createAtomicCompare(), D, llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(), and X.
Referenced by createAtomicCompare().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare | ( | const LocationDescription & | Loc, |
AtomicOpValue & | X, | ||
AtomicOpValue & | V, | ||
AtomicOpValue & | R, | ||
Value * | E, | ||
Value * | D, | ||
AtomicOrdering | AO, | ||
omp::OMPAtomicCompareOp | Op, | ||
bool | IsXBinopExpr, | ||
bool | IsPostfixUpdate, | ||
bool | IsFailOnly, | ||
AtomicOrdering | Failure | ||
) |
Definition at line 8049 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::IRBuilderBase::CreateAtomicCmpXchg(), llvm::IRBuilderBase::CreateAtomicRMW(), llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCmp(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateExtractValue(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateSExt(), llvm::IRBuilderBase::CreateStore(), llvm::IRBuilderBase::CreateUnreachable(), llvm::IRBuilderBase::CreateZExt(), D, llvm::Instruction::eraseFromParent(), llvm::CmpInst::FCMP_OGT, llvm::CmpInst::FCMP_OLT, llvm::AtomicRMWInst::FMax, llvm::AtomicRMWInst::FMin, llvm::IntegerType::get(), llvm::Module::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), llvm::CmpInst::ICMP_SGT, llvm::CmpInst::ICMP_SLT, llvm::CmpInst::ICMP_UGT, llvm::CmpInst::ICMP_ULT, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isIntegerTy(), llvm_unreachable, M, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::BasicBlock::splitBasicBlock(), llvm::AtomicRMWInst::UMax, llvm::AtomicRMWInst::UMin, updateToLocation(), and X.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicRead | ( | const LocationDescription & | Loc, |
AtomicOpValue & | X, | ||
AtomicOpValue & | V, | ||
AtomicOrdering | AO | ||
) |
Emit atomic Read for : V = X — Only Scalar data types.
Loc | The insert and source location description. |
X | The target pointer to be atomically read |
V | Memory address where to store atomically read value |
AO | Atomic ordering of the generated atomic instructions. |
Definition at line 7775 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateIntToPtr(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateStore(), llvm::IntegerType::get(), llvm::Module::getContext(), llvm::Type::getScalarSizeInBits(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), M, llvm::IRBuilderBase::saveIP(), llvm::LoadInst::setAtomic(), updateToLocation(), and X.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicUpdate | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
AtomicOpValue & | X, | ||
Value * | Expr, | ||
AtomicOrdering | AO, | ||
AtomicRMWInst::BinOp | RMWOp, | ||
AtomicUpdateCallbackTy & | UpdateOp, | ||
bool | IsXBinopExpr | ||
) |
Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types.
Loc | The insert and source location description. |
AllocaIP | The insertion point to be used for alloca instructions. |
X | The target atomic pointer to be updated |
Expr | The value to update X with. |
AO | Atomic ordering of the generated atomic instructions. |
RMWOp | The binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated. |
UpdateOp | Code generator for complex expressions that cannot be expressed through atomicrmw instruction. |
IsXBinopExpr | true if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr) |
Definition at line 7844 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::IRBuilderBase::saveIP(), llvm::AtomicRMWInst::UMax, llvm::AtomicRMWInst::UMin, updateToLocation(), and X.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicWrite | ( | const LocationDescription & | Loc, |
AtomicOpValue & | X, | ||
Value * | Expr, | ||
AtomicOrdering | AO | ||
) |
Emit atomic write for : X = Expr — Only Scalar data types.
Loc | The insert and source location description. |
X | The target pointer to be atomically written to |
Expr | The value to store. |
AO | Atomic ordering of the generated atomic instructions. |
Definition at line 7814 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateStore(), llvm::IntegerType::get(), llvm::Module::getContext(), llvm::Type::getScalarSizeInBits(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), M, llvm::IRBuilderBase::saveIP(), llvm::StoreInst::setAtomic(), updateToLocation(), and X.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createBarrier | ( | const LocationDescription & | Loc, |
omp::Directive | Kind, | ||
bool | ForceSimpleCall = false , |
||
bool | CheckCancelFlag = true |
||
) |
Emitter methods for OpenMP directives.
{ Generator for '#omp barrier'
Loc | The location where the barrier directive was encountered. |
Kind | The kind of directive that caused the barrier. |
ForceSimpleCall | Flag to force a simple (=non-cancellation) barrier. |
CheckCancelFlag | Flag to indicate a cancel barrier return value should be checked and acted upon. |
ThreadID | Optional parameter to pass in any existing ThreadID value. |
Definition at line 948 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, isLastFinalizationInfoCancellable(), llvm::IRBuilderBase::saveIP(), and updateToLocation().
Referenced by createCancel(), and createSingle().
CallInst * OpenMPIRBuilder::createCachedThreadPrivate | ( | const LocationDescription & | Loc, |
llvm::Value * | Pointer, | ||
llvm::ConstantInt * | Size, | ||
const llvm::Twine & | Name = Twine("") |
||
) |
Create a runtime call for kmpc_threadprivate_cached.
Loc | The insert and source location description. |
Pointer | pointer to data to be cached |
Size | size of data to be cached |
Name | Name of call Instruction for callinst |
Definition at line 5988 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateInternalVariable(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, Pointer, Size, and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCancel | ( | const LocationDescription & | Loc, |
Value * | IfCondition, | ||
omp::Directive | CanceledDirective | ||
) |
Generator for '#omp cancel'.
Loc | The location where the directive was encountered. |
IfCondition | The evaluated 'if' clause expression, if any. |
CanceledDirective | The kind of directive that is cancled. |
Definition at line 1000 of file OMPIRBuilder.cpp.
References Builder, createBarrier(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateUnreachable(), llvm::OpenMPIRBuilder::LocationDescription::DL, emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm_unreachable, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SplitBlockAndInsertIfThenElse(), and updateToLocation().
CanonicalLoopInfo * OpenMPIRBuilder::createCanonicalLoop | ( | const LocationDescription & | Loc, |
LoopBodyGenCallbackTy | BodyGenCB, | ||
Value * | Start, | ||
Value * | Stop, | ||
Value * | Step, | ||
bool | IsSigned, | ||
bool | InclusiveStop, | ||
InsertPointTy | ComputeIP = {} , |
||
const Twine & | Name = "loop" |
||
) |
Generator for the control flow structure of an OpenMP canonical loop.
Instead of a logical iteration space, this allows specifying user-defined loop counter values using increment, upper- and lower bounds. To disambiguate the terminology when counting downwards, instead of lower bounds we use Start
for the loop counter value in the first body iteration.
Consider the following limitations:
A loop counter space over all integer values of its bit-width cannot be represented. E.g using uint8_t, its loop trip count of 256 cannot be stored into an 8 bit integer):
DO I = 0, 255, 1
Unsigned wrapping is only supported when wrapping only "once"; E.g. effectively counting downwards:
for (uint8_t i = 100u; i > 0; i += 127u)
TODO: May need to add additional parameters to represent:
Sign of the step and the comparison operator might disagree:
for (int i = 0; i < 42; i -= 1u)
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the loop body code. |
Start | Value of the loop counter for the first iterations. |
Stop | Loop counter values past this will stop the loop. |
Step | Loop counter increment after each iteration; negative means counting down. |
IsSigned | Whether Start, Stop and Step are signed integers. |
InclusiveStop | Whether Stop itself is a valid value for the loop counter. |
ComputeIP | Insertion point for instructions computing the trip count. Can be used to ensure the trip count is available at the outermost loop of a loop nest. If not set, defaults to the preheader of the generated loop. |
Name | Base name used to derive BB and instruction names. |
Step
of INT_MIN cannot not be normalized to a positive direction:Definition at line 3907 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::IRBuilderBase::CreateAdd(), createCanonicalLoop(), llvm::IRBuilderBase::CreateICmp(), llvm::IRBuilderBase::CreateICmpSLT(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateNeg(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateSub(), llvm::IRBuilderBase::CreateUDiv(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::Value::getType(), llvm::CmpInst::ICMP_SLE, llvm::CmpInst::ICMP_SLT, llvm::CmpInst::ICMP_ULE, llvm::CmpInst::ICMP_ULT, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::InsertPoint::isSet(), IV, Name, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), and updateToLocation().
CanonicalLoopInfo * OpenMPIRBuilder::createCanonicalLoop | ( | const LocationDescription & | Loc, |
LoopBodyGenCallbackTy | BodyGenCB, | ||
Value * | TripCount, | ||
const Twine & | Name = "loop" |
||
) |
Generator for the control flow structure of an OpenMP canonical loop.
This generator operates on the logical iteration space of the loop, i.e. the caller only has to provide a loop trip count of the loop as defined by base language semantics. The trip count is interpreted as an unsigned integer. The induction variable passed to BodyGenCB
will be of the same type and run from 0 to TripCount
- 1. It is up to the callback to convert the logical iteration variable to the loop counter variable in the loop body.
Loc | The insert and source location description. The insert location can be between two instructions or the end of a degenerate block (e.g. a BB under construction). |
BodyGenCB | Callback that will generate the loop body code. |
TripCount | Number of iterations the loop body is executed. |
Name | Base name used to derive BB and instruction names. |
Definition at line 3878 of file OMPIRBuilder.cpp.
References After, llvm::CanonicalLoopInfo::assertOK(), Builder, llvm::IRBuilderBase::CreateBr(), createLoopSkeleton(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::CanonicalLoopInfo::getAfter(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::CanonicalLoopInfo::getBodyIP(), llvm::CanonicalLoopInfo::getIndVar(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::OpenMPIRBuilder::LocationDescription::IP, Name, llvm::spliceBB(), and updateToLocation().
Referenced by createCanonicalLoop(), and createSections().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyinClauseBlocks | ( | InsertPointTy | IP, |
Value * | MasterAddr, | ||
Value * | PrivateAddr, | ||
llvm::IntegerType * | IntPtrTy, | ||
bool | BranchtoEnd = true |
||
) |
Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies.
IP | insertion block for copyin conditional |
MasterVarPtr | a pointer to the master variable |
PrivateVarPtr | a pointer to the threadprivate variable |
IntPtrTy | Pointer size type |
BranchtoEnd | Create a branch between the copyin.not.master blocks |
Definition at line 5822 of file OMPIRBuilder.cpp.
References Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpNE(), llvm::IRBuilderBase::CreatePtrToInt(), llvm::Instruction::eraseFromParent(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), llvm::IRBuilderBase::InsertPoint::isSet(), M, llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::BasicBlock::splitBasicBlock().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyPrivate | ( | const LocationDescription & | Loc, |
llvm::Value * | BufSize, | ||
llvm::Value * | CpyBuf, | ||
llvm::Value * | CpyFn, | ||
llvm::Value * | DidIt | ||
) |
Generator for __kmpc_copyprivate.
Loc | The source location description. |
BufSize | Number of elements in the buffer. |
CpyBuf | List of pointers to data to be copied. |
CpyFn | function to call for copying data. |
DidIt | flag variable; 1 for 'single' thread, 0 otherwise. |
Definition at line 5513 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::getInt32Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::saveIP(), and updateToLocation().
Referenced by createSingle().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCritical | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB, | ||
StringRef | CriticalName, | ||
Value * | HintInst | ||
) |
Generator for '#omp critical'.
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the region body code. |
FiniCB | Callback to finalize variable copies. |
CriticalName | name of the lock used by the critical directive |
HintInst | Hint Instruction for hint clause associated with critical |
Definition at line 5599 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::SmallVectorTemplateBase< T, bool >::push_back(), and updateToLocation().
FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction | ( | ) |
Returns __kmpc_dispatch_deinit runtime function.
Definition at line 6551 of file OMPIRBuilder.cpp.
References getOrCreateRuntimeFunction(), and M.
FunctionCallee OpenMPIRBuilder::createDispatchFiniFunction | ( | unsigned | IVSize, |
bool | IVSigned | ||
) |
Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned.
Definition at line 6538 of file OMPIRBuilder.cpp.
References assert(), getOrCreateRuntimeFunction(), M, and Name.
FunctionCallee OpenMPIRBuilder::createDispatchInitFunction | ( | unsigned | IVSize, |
bool | IVSigned | ||
) |
Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned.
Definition at line 6512 of file OMPIRBuilder.cpp.
References assert(), getOrCreateRuntimeFunction(), M, and Name.
FunctionCallee OpenMPIRBuilder::createDispatchNextFunction | ( | unsigned | IVSize, |
bool | IVSigned | ||
) |
Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned.
Definition at line 6525 of file OMPIRBuilder.cpp.
References assert(), getOrCreateRuntimeFunction(), M, and Name.
void OpenMPIRBuilder::createFlush | ( | const LocationDescription & | Loc | ) |
Generator for '#omp flush'.
Loc | The location where the flush directive was encountered |
Definition at line 1679 of file OMPIRBuilder.cpp.
References emitFlush(), and updateToLocation().
FunctionCallee OpenMPIRBuilder::createForStaticInitFunction | ( | unsigned | IVSize, |
bool | IVSigned, | ||
bool | IsGPUDistribute | ||
) |
Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned.
Will create a distribute call __kmpc_distribute_static_init* if IsGPUDistribute is set.
Definition at line 6492 of file OMPIRBuilder.cpp.
References assert(), getOrCreateRuntimeFunction(), M, and Name.
GlobalValue * OpenMPIRBuilder::createGlobalFlag | ( | unsigned | Value, |
StringRef | Name | ||
) |
Create a hidden global flag Name
in the module with initial value Value
.
Definition at line 822 of file OMPIRBuilder.cpp.
References llvm::Module::getContext(), llvm::Type::getInt32Ty(), llvm::GlobalValue::HiddenVisibility, M, Name, and llvm::GlobalValue::WeakODRLinkage.
CanonicalLoopInfo * OpenMPIRBuilder::createLoopSkeleton | ( | DebugLoc | DL, |
Value * | TripCount, | ||
Function * | F, | ||
BasicBlock * | PreInsertBefore, | ||
BasicBlock * | PostInsertBefore, | ||
const Twine & | Name = {} |
||
) |
Create the control flow structure of a canonical OpenMP loop.
The emitted loop will be disconnected, i.e. no edge to the loop's preheader and no terminator in the AfterBB. The OpenMPIRBuilder's IRBuilder location is not preserved.
DL | DebugLoc used for the instructions in the skeleton. |
TripCount | Value to be used for the trip count. |
F | Function in which to insert the BasicBlocks. |
PreInsertBefore | Where to insert BBs that execute before the body, typically the body itself. |
PostInsertBefore | Where to insert BBs that execute after the body. |
Name | Base name used to derive BB and instruction names. |
Definition at line 3811 of file OMPIRBuilder.cpp.
References llvm::PHINode::addIncoming(), After, llvm::CanonicalLoopInfo::assertOK(), Builder, Cond, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateAdd(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpULT(), llvm::IRBuilderBase::CreatePHI(), DL, F, llvm::Module::getContext(), llvm::Value::getType(), LoopInfos, M, Name, llvm::IRBuilderBase::SetCurrentDebugLocation(), and llvm::IRBuilderBase::SetInsertPoint().
Referenced by collapseLoops(), createCanonicalLoop(), and tileLoops().
void OpenMPIRBuilder::createMapperAllocas | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
unsigned | NumOperands, | ||
struct MapperAllocas & | MapperAllocas | ||
) |
Create the allocas instruction used in call to mapper functions.
Definition at line 7276 of file OMPIRBuilder.cpp.
References llvm::OpenMPIRBuilder::MapperAllocas::Args, llvm::OpenMPIRBuilder::MapperAllocas::ArgsBase, llvm::OpenMPIRBuilder::MapperAllocas::ArgSizes, Builder, llvm::IRBuilderBase::CreateAlloca(), llvm::ArrayType::get(), Int64, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createMasked | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB, | ||
Value * | Filter | ||
) |
Generator for '#omp masked'.
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the region code. |
FiniCB | Callback to finialize variable copies. |
Definition at line 3787 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::Filter, getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createMaster | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB | ||
) |
Generator for '#omp master'.
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the region code. |
FiniCB | Callback to finalize variable copies. |
Definition at line 3762 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().
void OpenMPIRBuilder::createOffloadEntriesAndInfoMetadata | ( | EmitMetadataErrorReportFunctionTy & | ErrorReportFunction | ) |
Definition at line 8458 of file OMPIRBuilder.cpp.
References llvm::OffloadEntriesInfoManager::actOnDeviceGlobalVarEntriesInfo(), llvm::OffloadEntriesInfoManager::actOnTargetRegionEntriesInfo(), llvm::NamedMDNode::addOperand(), assert(), Builder, llvm::CallingConv::C, Config, createOffloadEntry(), llvm::dyn_cast(), EMIT_MD_DECLARE_TARGET_ERROR, EMIT_MD_GLOBAL_VAR_LINK_ERROR, EMIT_MD_TARGET_REGION_ERROR, llvm::offloading::emitOffloadingEntry(), llvm::OffloadEntriesInfoManager::empty(), llvm::omp::Flags, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::Module::getContext(), llvm::IRBuilderBase::getInt32Ty(), llvm::Module::getNamedValue(), llvm::Constant::getNullValue(), llvm::Module::getOrInsertNamedMetadata(), llvm::OpenMPIRBuilderConfig::getRequiresFlags(), llvm::PointerType::getUnqual(), llvm::OpenMPIRBuilderConfig::hasRequiresFlags(), llvm::OpenMPIRBuilderConfig::hasRequiresUnifiedSharedMemory(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm_unreachable, M, OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetGlobalRegisterRequires, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryIndirect, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, llvm::TargetRegionEntryInfo::ParentName, llvm::OffloadEntriesInfoManager::size(), and llvm::GlobalValue::WeakAnyLinkage.
Referenced by finalize().
void OpenMPIRBuilder::createOffloadEntry | ( | Constant * | ID, |
Constant * | Addr, | ||
uint64_t | Size, | ||
int32_t | Flags, | ||
GlobalValue::LinkageTypes | , | ||
StringRef | Name = "" |
||
) |
Creates offloading entry for the provided entry ID ID, address Addr, size Size, and flags Flags.
Definition at line 8422 of file OMPIRBuilder.cpp.
References llvm::Function::addFnAttr(), llvm::NamedMDNode::addOperand(), Addr, Config, llvm::offloading::emitOffloadingEntry(), llvm::omp::Flags, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::Attribute::get(), llvm::MDString::get(), llvm::Module::getContext(), llvm::Type::getInt32Ty(), llvm::Module::getOrInsertNamedMetadata(), llvm::GlobalValue::getParent(), llvm::OpenMPIRBuilderConfig::isGPU(), M, Name, and Size.
Referenced by createOffloadEntriesAndInfoMetadata().
GlobalVariable * OpenMPIRBuilder::createOffloadMapnames | ( | SmallVectorImpl< llvm::Constant * > & | Names, |
std::string | VarName | ||
) |
Create the global variable holding the offload names information.
Definition at line 8371 of file OMPIRBuilder.cpp.
References llvm::ConstantArray::get(), llvm::ArrayType::get(), llvm::Module::getContext(), llvm::Value::getType(), llvm::PointerType::getUnqual(), M, llvm::GlobalValue::PrivateLinkage, and llvm::SmallVectorBase< Size_T >::size().
Referenced by emitOffloadingArrays().
GlobalVariable * OpenMPIRBuilder::createOffloadMaptypes | ( | SmallVectorImpl< uint64_t > & | Mappings, |
std::string | VarName | ||
) |
Create the global variable holding the offload mappings information.
Definition at line 7264 of file OMPIRBuilder.cpp.
References llvm::ConstantDataArray::get(), llvm::Module::getContext(), llvm::Value::getType(), llvm::GlobalValue::Global, M, and llvm::GlobalValue::PrivateLinkage.
Referenced by emitOffloadingArrays().
CallInst * OpenMPIRBuilder::createOMPAlloc | ( | const LocationDescription & | Loc, |
Value * | Size, | ||
Value * | Allocator, | ||
std::string | Name = "" |
||
) |
Create a runtime call for kmpc_Alloc.
Loc | The insert and source location description. |
Size | Size of allocated memory space |
Allocator | Allocator information instruction |
Name | Name of call Instruction for OMP_alloc |
Definition at line 5872 of file OMPIRBuilder.cpp.
References Allocator, Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, Size, and updateToLocation().
CallInst * OpenMPIRBuilder::createOMPFree | ( | const LocationDescription & | Loc, |
Value * | Addr, | ||
Value * | Allocator, | ||
std::string | Name = "" |
||
) |
Create a runtime call for kmpc_free.
Loc | The insert and source location description. |
Addr | Address of memory space to be freed |
Allocator | Allocator information instruction |
Name | Name of call Instruction for OMP_Free |
Definition at line 5889 of file OMPIRBuilder.cpp.
References Addr, Allocator, Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, and updateToLocation().
CallInst * OpenMPIRBuilder::createOMPInteropDestroy | ( | const LocationDescription & | Loc, |
Value * | InteropVar, | ||
Value * | Device, | ||
Value * | NumDependences, | ||
Value * | DependenceAddress, | ||
bool | HaveNowaitClause | ||
) |
Create a runtime call for __tgt_interop_destroy.
Loc | The insert and source location description. |
InteropVar | variable to be allocated |
Device | devide to which offloading will occur |
NumDependences | number of dependence variables |
DependenceAddress | pointer to dependence variables |
HaveNowaitClause | does nowait clause exist |
Definition at line 5933 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().
CallInst * OpenMPIRBuilder::createOMPInteropInit | ( | const LocationDescription & | Loc, |
Value * | InteropVar, | ||
omp::OMPInteropType | InteropType, | ||
Value * | Device, | ||
Value * | NumDependences, | ||
Value * | DependenceAddress, | ||
bool | HaveNowaitClause | ||
) |
Create a runtime call for __tgt_interop_init.
Loc | The insert and source location description. |
InteropVar | variable to be allocated |
InteropType | type of interop operation |
Device | devide to which offloading will occur |
NumDependences | number of dependence variables |
DependenceAddress | pointer to dependence variables |
HaveNowaitClause | does nowait clause exist |
Definition at line 5904 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().
CallInst * OpenMPIRBuilder::createOMPInteropUse | ( | const LocationDescription & | Loc, |
Value * | InteropVar, | ||
Value * | Device, | ||
Value * | NumDependences, | ||
Value * | DependenceAddress, | ||
bool | HaveNowaitClause | ||
) |
Create a runtime call for __tgt_interop_use.
Loc | The insert and source location description. |
InteropVar | variable to be allocated |
Device | devide to which offloading will occur |
NumDependences | number of dependence variables |
DependenceAddress | pointer to dependence variables |
HaveNowaitClause | does nowait clause exist |
Definition at line 5960 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createOrderedDepend | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
unsigned | NumLoops, | ||
ArrayRef< llvm::Value * > | StoreValues, | ||
const Twine & | Name, | ||
bool | IsDependSource | ||
) |
Generator for '#omp ordered depend (source | sink)'.
Loc | The insert and source location description. |
AllocaIP | The insertion point to be used for alloca instructions. |
NumLoops | The number of loops in depend clause. |
StoreValues | The value will be stored in vector address. |
Name | The name of alloca instruction. |
IsDependSource | If true, depend source; otherwise, depend sink. |
Definition at line 5634 of file OMPIRBuilder.cpp.
References llvm::all_of(), assert(), Builder, llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreateStore(), llvm::ArrayType::get(), llvm::IRBuilderBase::getInt64(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getType(), I, Int64, llvm::OpenMPIRBuilder::LocationDescription::IP, Name, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::AllocaInst::setAlignment(), llvm::StoreInst::setAlignment(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createOrderedThreadsSimd | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB, | ||
bool | IsThreads | ||
) |
Generator for '#omp ordered [threads | simd]'.
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the region code. |
FiniCB | Callback to finalize variable copies. |
IsThreads | If true, with threads clause or without clause; otherwise, with simd clause; |
Definition at line 5680 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().
IRBuilder::InsertPoint OpenMPIRBuilder::createParallel | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
BodyGenCallbackTy | BodyGenCB, | ||
PrivatizeCallbackTy | PrivCB, | ||
FinalizeCallbackTy | FiniCB, | ||
Value * | IfCondition, | ||
Value * | NumThreads, | ||
omp::ProcBindKind | ProcBind, | ||
bool | IsCancellable | ||
) |
Generator for '#omp parallel'.
Loc | The insert and source location description. |
AllocaIP | The insertion points to be used for alloca instructions. |
BodyGenCB | Callback that will generate the region code. |
PrivCB | Callback to copy a given variable (think copy constructor). |
FiniCB | Callback to finalize variable copies. |
IfCondition | The evaluated 'if' clause expression, if any. |
NumThreads | The evaluated 'num_threads' clause expression, if any. |
ProcBind | The value of the 'proc_bind' clause (see ProcBindKind). |
IsCancellable | Flag to indicate a cancellable parallel region. |
Definition at line 1357 of file OMPIRBuilder.cpp.
References addOutlineInfo(), assert(), llvm::BasicBlock::begin(), Blocks, Builder, llvm::OpenMPIRBuilder::OutlineInfo::collectBlocks(), Config, llvm::SmallPtrSetImpl< PtrType >::count(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateIntCast(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateStore(), llvm::dbgs(), llvm::SetVector< T, Vector, Set, N >::empty(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, FinalizationStack, llvm::CodeExtractor::findAllocas(), llvm::CodeExtractor::findInputsOutputs(), llvm::DataLayout::getAllocaAddrSpace(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::CallBase::getCalledFunction(), llvm::FunctionCallee::getCallee(), llvm::IRBuilderBase::getContext(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::BasicBlock::getFirstInsertionPt(), llvm::IRBuilderBase::GetInsertBlock(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), hostParallelCallback(), I, llvm::Instruction::insertAfter(), Int32, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), LLVM_DEBUG, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, Ptr, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::Value::setName(), llvm::BasicBlock::splitBasicBlock(), llvm::SplitBlock(), targetParallelCallback(), updateToLocation(), and Uses.
Get the create a name using the platform specific separators.
Parts | parts of the final name that needs separation The created name has a first separator between the first and second part and a second separator between all other parts. E.g. with FirstSeparator "$" and Separator "." and parts: "p1", "p2", "p3", "p4" The resulting name is "p1$p2.p3.p4" The separators are retrieved from the OpenMPIRBuilderConfig. |
Definition at line 7213 of file OMPIRBuilder.cpp.
References Config, llvm::OpenMPIRBuilderConfig::firstSeparator(), and llvm::OpenMPIRBuilderConfig::separator().
Referenced by emitOffloadingArrays(), emitTargetRegionFunction(), and registerTargetGlobalVariable().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createReductions | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
ArrayRef< ReductionInfo > | ReductionInfos, | ||
ArrayRef< bool > | IsByRef, | ||
bool | IsNoWait = false |
||
) |
Generator for '#omp reduction'.
Emits the IR instructing the runtime to perform the specific kind of reductions. Expects reduction variables to have been privatized and initialized to reduction-neutral values separately. Emits the calls to runtime functions as well as the reduction function and the basic blocks performing the reduction atomically and non-atomically.
The code emitted for the following:
corresponds to the following sketch.
Loc | The location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables. |
AllocaIP | An insertion point suitable for allocas usable in reductions. |
ReductionInfos | A list of info on each reduction variable. |
IsNoWait | A flag set if the reduction is marked as nowait. |
IsByRef | A flag set if the reduction is using reference or direct value. |
Definition at line 3589 of file OMPIRBuilder.cpp.
References llvm::all_of(), assert(), llvm::OpenMPIRBuilder::ReductionInfo::AtomicReductionGen, Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_64(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateRetVoid(), llvm::IRBuilderBase::CreateStore(), llvm::IRBuilderBase::CreateSwitch(), llvm::IRBuilderBase::CreateUnreachable(), DL, llvm::OpenMPIRBuilder::ReductionInfo::ElementType, llvm::BasicBlock::end(), llvm::enumerate(), llvm::Instruction::eraseFromParent(), llvm::ArrayType::get(), llvm::Function::getArg(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), getFreshReductionFunc(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::IRBuilderBase::InsertPoint::getPoint(), llvm::IRBuilderBase::getPtrTy(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), llvm::OpenMPIRBuilder::LocationDescription::IP, LHS, llvm::none_of(), P, llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGen, llvm::IRBuilderBase::restoreIP(), RHS, llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), llvm::BasicBlock::splitBasicBlock(), updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createReductionsGPU | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
InsertPointTy | CodeGenIP, | ||
ArrayRef< ReductionInfo > | ReductionInfos, | ||
bool | IsNoWait = false , |
||
bool | IsTeamsReduction = false , |
||
bool | HasDistribute = false , |
||
ReductionGenCBKind | ReductionGenCBKind = ReductionGenCBKind::MLIR , |
||
std::optional< omp::GV > | GridValue = {} , |
||
unsigned | ReductionBufNum = 1024 , |
||
Value * | SrcLocInfo = nullptr |
||
) |
Design of OpenMP reductions on the GPU.
Consider a typical OpenMP program with one or more reduction clauses:
float foo; double bar; #pragma omp target teams distribute parallel for \ reduction(+:foo) reduction(*:bar) for (int i = 0; i < N; i++) { foo += A[i]; bar *= B[i]; }
where 'foo' and 'bar' are reduced across all OpenMP threads in all teams. In our OpenMP implementation on the NVPTX device an OpenMP team is mapped to a CUDA threadblock and OpenMP threads within a team are mapped to CUDA threads within a threadblock. Our goal is to efficiently aggregate values across all OpenMP threads such that:
Introduction to Decoupling
We would like to decouple the compiler and the runtime so that the latter is ignorant of the reduction variables (number, data types) and the reduction operators. This allows a simpler interface and implementation while still attaining good performance.
Pseudocode for the aforementioned OpenMP program generated by the compiler is as follows:
Call the OpenMP runtime on the GPU to reduce within a team and store the result on the team master:
__kmpc_nvptx_parallel_reduce_nowait_v2(..., reduceData, shuffleReduceFn, interWarpCpyFn)
where: struct ReduceData { double *foo; double *bar; } reduceData reduceData.foo = &foo_private reduceData.bar = &bar_private
'shuffleReduceFn' and 'interWarpCpyFn' are pointers to two auxiliary functions generated by the compiler that operate on variables of type 'ReduceData'. They aid the runtime perform algorithmic steps in a data agnostic manner.
'shuffleReduceFn' is a pointer to a function that reduces data of type 'ReduceData' across two OpenMP threads (lanes) in the same warp. It takes the following arguments as input:
a. variable of type 'ReduceData' on the calling lane, b. its lane_id, c. an offset relative to the current lane_id to generate a remote_lane_id. The remote lane contains the second variable of type 'ReduceData' that is to be reduced. d. an algorithm version parameter determining which reduction algorithm to use.
'shuffleReduceFn' retrieves data from the remote lane using efficient GPU shuffle intrinsics and reduces, using the algorithm specified by the 4th parameter, the two operands element-wise. The result is written to the first operand.
Different reduction algorithms are implemented in different runtime functions, all calling 'shuffleReduceFn' to perform the essential reduction step. Therefore, based on the 4th parameter, this function behaves slightly differently to cooperate with the runtime to ensure correctness under different circumstances.
'InterWarpCpyFn' is a pointer to a function that transfers reduced variables across warps. It tunnels, through CUDA shared memory, the thread-private data of type 'ReduceData' from lane 0 of each warp to a lane in the first warp.
Call the OpenMP runtime on the GPU to reduce across teams. The last team writes the global reduced value to memory.
ret = __kmpc_nvptx_teams_reduce_nowait(..., reduceData, shuffleReduceFn, interWarpCpyFn, scratchpadCopyFn, loadAndReduceFn)
'scratchpadCopyFn' is a helper that stores reduced data from the team master to a scratchpad array in global memory.
'loadAndReduceFn' is a helper that loads data from the scratchpad array and reduces it with the input operand.
These compiler generated functions hide address calculation and alignment information from the runtime.
Warp Reduction Algorithms
On the warp level, we have three algorithms implemented in the OpenMP runtime depending on the number of active lanes:
Full Warp Reduction
The reduce algorithm within a warp where all lanes are active is implemented in the runtime as follows:
full_warp_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { for (int offset = WARPSIZE/2; offset > 0; offset /= 2) ShuffleReduceFn(reduce_data, 0, offset, 0); }
The algorithm completes in log(2, WARPSIZE) steps.
'ShuffleReduceFn' is used here with lane_id set to 0 because it is not used therefore we save instructions by not retrieving lane_id from the corresponding special registers. The 4th parameter, which represents the version of the algorithm being used, is set to 0 to signify full warp reduction.
In this version, 'ShuffleReduceFn' behaves, per element, as follows:
#reduce_elem refers to an element in the local lane's data structure #remote_elem is retrieved from a remote lane remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); reduce_elem = reduce_elem REDUCE_OP remote_elem;
Contiguous Partial Warp Reduction
This reduce algorithm is used within a warp where only the first 'n' (n <= WARPSIZE) lanes are active. It is typically used when the number of OpenMP threads in a parallel region is not a multiple of WARPSIZE. The algorithm is implemented in the runtime as follows:
void contiguous_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn, int size, int lane_id) { int curr_size; int offset; curr_size = size; mask = curr_size/2; while (offset>0) { ShuffleReduceFn(reduce_data, lane_id, offset, 1); curr_size = (curr_size+1)/2; offset = curr_size/2; } }
In this version, 'ShuffleReduceFn' behaves, per element, as follows:
remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id < offset) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem
This algorithm assumes that the data to be reduced are located in a contiguous subset of lanes starting from the first. When there is an odd number of active lanes, the data in the last lane is not aggregated with any other lane's dat but is instead copied over.
Dispersed Partial Warp Reduction
This algorithm is used within a warp when any discontiguous subset of lanes are active. It is used to implement the reduction operation across lanes in an OpenMP simd region or in a nested parallel region.
void dispersed_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { int size, remote_id; int logical_lane_id = number_of_active_lanes_before_me() * 2; do { remote_id = next_active_lane_id_right_after_me();
size = number_of_active_lanes_in_this_warp(); logical_lane_id /= 2; ShuffleReduceFn(reduce_data, logical_lane_id, remote_id-1-threadIdx.x, 2); } while (logical_lane_id % 2 == 0 && size > 1); }
There is no assumption made about the initial state of the reduction. Any number of lanes (>=1) could be active at any position. The reduction result is returned in the first active lane.
In this version, 'ShuffleReduceFn' behaves, per element, as follows:
remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id % 2 == 0 && offset > 0) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem
Intra-Team Reduction
This function, as implemented in the runtime call '__kmpc_nvptx_parallel_reduce_nowait_v2', aggregates data across OpenMP threads in a team. It first reduces within a warp using the aforementioned algorithms. We then proceed to gather all such reduced values at the first warp.
The runtime makes use of the function 'InterWarpCpyFn', which copies data from each of the "warp master" (zeroth lane of each warp, where warp-reduced data is held) to the zeroth warp. This step reduces (in a mathematical sense) the problem of reduction across warp masters in a block to the problem of warp reduction.
Inter-Team Reduction
Once a team has reduced its data to a single value, it is stored in a global scratchpad array. Since each team has a distinct slot, this can be done without locking.
The last team to write to the scratchpad array proceeds to reduce the scratchpad array. One or more workers in the last team use the helper 'loadAndReduceDataFn' to load and reduce values from the array, i.e., the k'th worker reduces every k'th element.
Finally, a call is made to '__kmpc_nvptx_parallel_reduce_nowait_v2' to reduce across workers and compute a globally reduced value.
Loc | The location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables. |
AllocaIP | An insertion point suitable for allocas usable in reductions. |
CodeGenIP | An insertion point suitable for code generation. |
ReductionInfos | A list of info on each reduction variable. |
IsNoWait | Optional flag set if the reduction is marked as nowait. |
IsTeamsReduction | Optional flag set if it is a teams reduction. |
HasDistribute | Optional flag set if it is a distribute reduction. |
GridValue | Optional GPU grid value. |
ReductionBufNum | Optional OpenMPCUDAReductionBufNumValue to be used for teams reduction. |
SrcLocInfo | Source location information global. |
Definition at line 3398 of file OMPIRBuilder.cpp.
References llvm::AttrBuilder::addAttribute(), llvm::AttributeList::addFnAttributes(), assert(), Builder, checkReductionInfos(), Clang, Cond, Config, llvm::BasicBlock::Create(), llvm::StructType::create(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreatePointerBitCastOrAddrSpaceCast(), llvm::IRBuilderBase::CreateStore(), emitBlock(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::enumerate(), llvm::ArrayType::get(), llvm::Function::getAttributes(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::DataLayout::getDefaultGlobalsAddressSpace(), llvm::AttributeList::getFnAttrs(), getGridValue(), llvm::IRBuilderBase::getIndexTy(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), llvm::Value::getName(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), LHS, M, llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGenClang, llvm::AttrBuilder::removeAttribute(), llvm::Value::replaceUsesWithIf(), llvm::IRBuilderBase::restoreIP(), RHS, llvm::IRBuilderBase::saveIP(), llvm::OpenMPIRBuilderConfig::setEmitLLVMUsed(), llvm::OpenMPIRBuilderConfig::setGridValue(), llvm::ArrayRef< T >::size(), Size, updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createSection | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB | ||
) |
Generator for '#omp section'.
Loc | The insert and source location description. |
BodyGenCB | Callback that will generate the region body code. |
FiniCB | Callback to finalize variable copies. |
Definition at line 2165 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::BasicBlock::getSinglePredecessor(), llvm::Instruction::getSuccessor(), llvm::BasicBlock::getTerminator(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createSections | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
ArrayRef< StorableBodyGenCallbackTy > | SectionCBs, | ||
PrivatizeCallbackTy | PrivCB, | ||
FinalizeCallbackTy | FiniCB, | ||
bool | IsCancellable, | ||
bool | IsNowait | ||
) |
Generator for '#omp sections'.
Loc | The insert and source location description. |
AllocaIP | The insertion points to be used for alloca instructions. |
SectionCBs | Callbacks that will generate body of each section. |
PrivCB | Callback to copy a given variable (think copy constructor). |
FiniCB | Callback to finalize variable copies. |
IsCancellable | Flag to indicate a cancellable parallel region. |
IsNowait | If true, barrier - to ensure all sections are executed before moving forward will not be generated. |
Definition at line 2070 of file OMPIRBuilder.cpp.
References llvm::SwitchInst::addCase(), assert(), llvm::BasicBlock::begin(), Builder, llvm::Continue, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateBr(), createCanonicalLoop(), llvm::IRBuilderBase::CreateSwitch(), llvm::OpenMPIRBuilder::LocationDescription::DL, FinalizationStack, llvm::Module::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt32Ty(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), M, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), llvm::splitBBWithSuffix(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createSingle | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
FinalizeCallbackTy | FiniCB, | ||
bool | IsNowait, | ||
ArrayRef< llvm::Value * > | CPVars = {} , |
||
ArrayRef< llvm::Function * > | CPFuncs = {} |
||
) |
Generator for '#omp single'.
Loc | The source location description. |
BodyGenCB | Callback that will generate the region code. |
FiniCB | Callback to finalize variable copies. |
IsNowait | If false, a barrier is emitted. |
CPVars | copyprivate variables. |
CPFuncs | copy functions to use for each copyprivate variable. |
Definition at line 5534 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateAlloca(), createBarrier(), llvm::IRBuilderBase::CreateCall(), createCopyPrivate(), llvm::IRBuilderBase::CreateStore(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::ArrayRef< T >::empty(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt32Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), I, Int64, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::saveIP(), llvm::ArrayRef< T >::size(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTarget | ( | const LocationDescription & | Loc, |
bool | IsOffloadEntry, | ||
OpenMPIRBuilder::InsertPointTy | AllocaIP, | ||
OpenMPIRBuilder::InsertPointTy | CodeGenIP, | ||
TargetRegionEntryInfo & | EntryInfo, | ||
ArrayRef< int32_t > | NumTeams, | ||
ArrayRef< int32_t > | NumThreads, | ||
SmallVectorImpl< Value * > & | Inputs, | ||
GenMapInfoCallbackTy | GenMapInfoCB, | ||
OpenMPIRBuilder::TargetBodyGenCallbackTy | CBFunc, | ||
OpenMPIRBuilder::TargetGenArgAccessorsCallbackTy | ArgAccessorFuncCB, | ||
SmallVector< DependData > | Dependencies = {} |
||
) |
Generator for '#omp target'.
Loc | where the target data construct was encountered. |
IsOffloadEntry | whether it is an offload entry. |
CodeGenIP | The insertion point where the call to the outlined function should be emitted. |
EntryInfo | The entry information about the function. |
NumTeams | Number of teams specified in the num_teams clause. |
NumThreads | Number of teams specified in the thread_limit clause. |
Inputs | The input values to the region that will be passed. as arguments to the outlined function. |
BodyGenCB | Callback that will generate the region code. |
ArgAccessorFuncCB | Callback that will generate accessors instructions for passed in target arguments where neccessary |
Dependencies | A vector of DependData objects that carry |
Definition at line 7167 of file OMPIRBuilder.cpp.
References Builder, Config, emitTargetCall(), emitTargetOutlinedFunction(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTargetData | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
InsertPointTy | CodeGenIP, | ||
Value * | DeviceID, | ||
Value * | IfCond, | ||
TargetDataInfo & | Info, | ||
GenMapInfoCallbackTy | GenMapInfoCB, | ||
omp::RuntimeFunction * | MapperFunc = nullptr , |
||
function_ref< InsertPointTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)> | BodyGenCB = nullptr , |
||
function_ref< void(unsigned int, Value *)> | DeviceAddrCB = nullptr , |
||
function_ref< Value *(unsigned int)> | CustomMapperCB = nullptr , |
||
Value * | SrcLocInfo = nullptr |
||
) |
Generator for '#omp target data'.
Loc | The location where the target data construct was encountered. |
AllocaIP | The insertion points to be used for alloca instructions. |
CodeGenIP | The insertion point at which the target directive code should be placed. |
IsBegin | If true then emits begin mapper call otherwise emits end mapper call. |
DeviceID | Stores the DeviceID from the device clause. |
IfCond | Value which corresponds to the if clause condition. |
Info | Stores all information realted to the Target Data directive. |
GenMapInfoCB | Callback that populates the MapInfos and returns. |
BodyGenCB | Optional Callback to generate the region code. |
DeviceAddrCB | Optional callback to generate code related to use_device_ptr and use_device_addr. |
CustomMapperCB | Optional callback to generate code related to custom mappers. |
Definition at line 6351 of file OMPIRBuilder.cpp.
References assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, Builder, Config, llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateStore(), DupNoPriv, emitIfClause(), emitOffloadingArrays(), emitOffloadingArraysArgument(), llvm::SmallVectorBase< Size_T >::empty(), llvm::IRBuilderBase::getInt32(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::IRBuilderBase::getPtrTy(), Info, llvm::OpenMPIRBuilderConfig::IsTargetDevice, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::MapInfosTy::Names, NoPriv, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, Priv, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray, and updateToLocation().
void OpenMPIRBuilder::createTargetDeinit | ( | const LocationDescription & | Loc, |
int32_t | TeamsReductionDataSize = 0 , |
||
int32_t | TeamsReductionBufferLength = 1024 |
||
) |
Create a runtime call for kmpc_target_deinit.
Loc | The insert and source location description. |
TeamsReductionDataSize | The maximal size of all the reduction data for teams reduction. |
TeamsReductionBufferLength | The number of elements (each of up to TeamsReductionDataSize size), in the teams reduction buffer. |
Definition at line 6143 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::ConstantFoldInsertValueInstruction(), llvm::IRBuilderBase::CreateCall(), llvm::StringRef::drop_back(), llvm::StringRef::ends_with(), llvm::IRBuilderBase::GetInsertBlock(), llvm::Value::getName(), llvm::Module::getNamedGlobal(), getOrCreateRuntimeFunctionPtr(), llvm::BasicBlock::getParent(), Int32, M, and updateToLocation().
Referenced by createOutlinedFunction().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTargetInit | ( | const LocationDescription & | Loc, |
bool | IsSPMD, | ||
int32_t | MinThreadsVal = 0 , |
||
int32_t | MaxThreadsVal = 0 , |
||
int32_t | MinTeamsVal = 0 , |
||
int32_t | MaxTeamsVal = 0 |
||
) |
The omp target
interface.
For more information about the usage of this interface,
{ Create a runtime call for kmpc_target_init
Loc | The insert and source location description. |
IsSPMD | Flag to indicate if the kernel is an SPMD kernel or not. |
MinThreads | Minimal number of threads, or 0. |
MaxThreads | Maximal number of threads, or 0. |
MinTeams | Minimal number of teams, or 0. |
MaxTeams | Maximal number of teams, or 0. |
Definition at line 6009 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateRetVoid(), llvm::IRBuilderBase::CreateUnreachable(), DL, llvm::StringRef::drop_back(), llvm::StringRef::ends_with(), llvm::Instruction::eraseFromParent(), llvm::ConstantStruct::get(), llvm::ConstantExpr::getAddrSpaceCast(), llvm::Constant::getAllOnesValue(), llvm::Function::getArg(), llvm::BasicBlock::getContext(), llvm::Function::getDataLayout(), llvm::BasicBlock::getFirstInsertionPt(), llvm::Module::getFunction(), getGridValue(), llvm::IRBuilderBase::GetInsertBlock(), llvm::Value::getName(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), llvm::ConstantInt::getSigned(), llvm::BasicBlock::getTerminator(), llvm::GlobalValue::getType(), llvm::Value::getType(), Int16, Int32, llvm::OpenMPIRBuilder::LocationDescription::IP, M, MaxThreads, llvm::GlobalValue::NotThreadLocal, llvm::omp::OMP_TGT_EXEC_MODE_GENERIC, llvm::omp::OMP_TGT_EXEC_MODE_SPMD, llvm::GlobalValue::ProtectedVisibility, llvm::IRBuilderBase::SetInsertPoint(), llvm::GlobalValue::setVisibility(), llvm::BasicBlock::splitBasicBlock(), updateToLocation(), llvm::GlobalValue::WeakODRLinkage, writeTeamsForKernel(), and writeThreadBoundsForKernel().
Referenced by createOutlinedFunction().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTask | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
BodyGenCallbackTy | BodyGenCB, | ||
bool | Tied = true , |
||
Value * | Final = nullptr , |
||
Value * | IfCondition = nullptr , |
||
SmallVector< DependData > | Dependencies = {} |
||
) |
Generator for #omp task
Loc | The location where the task construct was encountered. |
AllocaIP | The insertion point to be used for alloca instructions. |
BodyGenCB | Callback that will generate the region code. |
Tied | True if the task is tied, false if the task is untied. |
Final | i1 value which is true if the task is final, false if the task is not final. |
IfCondition | i1 value. If it evaluates to false , an undeferred task is generated, and the encountering thread must suspend the current task region, for which execution cannot be resumed until execution of the structured block that is associated with the generated task is completed. |
Definition at line 1791 of file OMPIRBuilder.cpp.
References addOutlineInfo(), Addr, llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::back(), llvm::sampleprof::Base, llvm::BasicBlock::begin(), Builder, llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_64(), createFakeIntVal(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateMemCpy(), llvm::IRBuilderBase::CreateOr(), llvm::IRBuilderBase::CreatePtrToInt(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateStore(), llvm::IRBuilderBase::CreateStructGEP(), llvm::divideCeil(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::omp::Flags, llvm::ConstantPointerNull::get(), llvm::ArrayType::get(), llvm::AllocaInst::getAllocatedType(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::Instruction::getDebugLoc(), llvm::Function::getEntryBlock(), llvm::IRBuilderBase::GetInsertPoint(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt32Ty(), llvm::IRBuilderBase::getInt64(), llvm::IRBuilderBase::getInt64Ty(), llvm::IRBuilderBase::getInt8Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::Value::getPointerAlignment(), llvm::DataLayout::getTypeSizeInBits(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), I, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, P, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::Value::replaceUsesWithIf(), llvm::IRBuilderBase::restoreIP(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::Instruction::setDebugLoc(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SmallVectorBase< Size_T >::size(), Size, llvm::splitBB(), llvm::SplitBlockAndInsertIfThenElse(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTaskgroup | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
BodyGenCallbackTy | BodyGenCB | ||
) |
Generator for the taskgroup construct.
Loc | The location where the taskgroup construct was encountered. |
AllocaIP | The insertion point to be used for alloca instructions. |
BodyGenCB | Callback that will generate the region code. |
Definition at line 2042 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::splitBB(), and updateToLocation().
void OpenMPIRBuilder::createTaskwait | ( | const LocationDescription & | Loc | ) |
Generator for '#omp taskwait'.
Loc | The location where the taskwait directive was encountered. |
Definition at line 1698 of file OMPIRBuilder.cpp.
References emitTaskwaitImpl(), and updateToLocation().
void OpenMPIRBuilder::createTaskyield | ( | const LocationDescription & | Loc | ) |
Generator for '#omp taskyield'.
Loc | The location where the taskyield directive was encountered. |
Definition at line 1716 of file OMPIRBuilder.cpp.
References emitTaskyieldImpl(), and updateToLocation().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTeams | ( | const LocationDescription & | Loc, |
BodyGenCallbackTy | BodyGenCB, | ||
Value * | NumTeamsLower = nullptr , |
||
Value * | NumTeamsUpper = nullptr , |
||
Value * | ThreadLimit = nullptr , |
||
Value * | IfExpr = nullptr |
||
) |
Generator for #omp teams
Loc | The location where the teams construct was encountered. |
BodyGenCB | Callback that will generate the region code. |
NumTeamsLower | Lower bound on number of teams. If this is nullptr, it is as if lower bound is specified as equal to upperbound. If this is non-null, then upperbound must also be non-null. |
NumTeamsUpper | Upper bound on the number of teams. |
ThreadLimit | on the number of threads that may participate in a contention group created by each team. |
IfExpr | is the integer argument value of the if condition on the teams clause. |
Definition at line 8227 of file OMPIRBuilder.cpp.
References addOutlineInfo(), llvm::Function::arg_size(), llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, Config, llvm::IRBuilderBase::CreateCall(), createFakeIntVal(), llvm::IRBuilderBase::CreateICmpNE(), llvm::IRBuilderBase::CreateSelect(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::Function::getArg(), llvm::CallBase::getArgOperand(), llvm::Function::getEntryBlock(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::Value::getNumUses(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::Value::getType(), I, Int1, llvm::Type::isIntegerTy(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::Value::setName(), llvm::splitBB(), updateToLocation(), and llvm::Value::user_back().
void OpenMPIRBuilder::emitBlock | ( | BasicBlock * | BB, |
Function * | CurFn, | ||
bool | IsFinished = false |
||
) |
Definition at line 7656 of file OMPIRBuilder.cpp.
References Builder, emitBranch(), llvm::Function::end(), llvm::BasicBlock::eraseFromParent(), llvm::IRBuilderBase::GetInsertBlock(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::BasicBlock::getParent(), llvm::Function::insert(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::Value::use_empty().
Referenced by createReductionsGPU(), emitIfClause(), and emitKernelLaunch().
void OpenMPIRBuilder::emitBranch | ( | BasicBlock * | Target | ) |
Definition at line 7642 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::ClearInsertionPoint(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::GetInsertBlock(), and llvm::BasicBlock::getTerminator().
Referenced by emitBlock(), emitIfClause(), and emitKernelLaunch().
void OpenMPIRBuilder::emitCancelationCheckImpl | ( | Value * | CancelFlag, |
omp::Directive | CanceledDirective, | ||
FinalizeCallbackTy | ExitCB = {} |
||
) |
Generate control flow and cleanup for cancellation.
CancelFlag | Flag indicating if the cancellation is performed. |
CanceledDirective | The kind of directive that is cancled. |
ExitCB | Extra code to be generated in the exit block. |
Definition at line 1142 of file OMPIRBuilder.cpp.
References assert(), llvm::BasicBlock::begin(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateIsNull(), llvm::BasicBlock::end(), llvm::Instruction::eraseFromParent(), FinalizationStack, llvm::BasicBlock::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::GetInsertPoint(), llvm::Value::getName(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), isLastFinalizationInfoCancellable(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::SplitBlock().
Referenced by createBarrier(), and createCancel().
void OpenMPIRBuilder::emitFlush | ( | const LocationDescription & | Loc | ) |
Generate a flush runtime call.
Loc | The location at which the request originated and is fulfilled. |
Definition at line 1670 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), and getOrCreateSrcLocStr().
Referenced by createFlush().
void OpenMPIRBuilder::emitIfClause | ( | Value * | Cond, |
BodyGenCallbackTy | ThenGen, | ||
BodyGenCallbackTy | ElseGen, | ||
InsertPointTy | AllocaIP = {} |
||
) |
Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }.
Definition at line 7677 of file OMPIRBuilder.cpp.
References Builder, Cond, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), emitBlock(), emitBranch(), llvm::Module::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::BasicBlock::getParent(), M, and llvm::IRBuilderBase::saveIP().
Referenced by createTargetData().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitKernelLaunch | ( | const LocationDescription & | Loc, |
Function * | OutlinedFn, | ||
Value * | OutlinedFnID, | ||
EmitFallbackCallbackTy | EmitTargetCallFallbackCB, | ||
TargetKernelArgs & | Args, | ||
Value * | DeviceID, | ||
Value * | RTLoc, | ||
InsertPointTy | AllocaIP | ||
) |
Generate a target region entry call and host fallback call.
Loc | The location at which the request originated and is fulfilled. |
OutlinedFn | The outlined kernel function. |
OutlinedFnID | The ooulined function ID. |
EmitTargetCallFallbackCB | Call back function to generate host fallback code. |
Args | Data structure holding information about the kernel arguments. |
DeviceID | Identifier for the device via the 'device' clause. |
RTLoc | Source location identifier |
AllocaIP | The insertion point to be used for alloca instructions. |
Definition at line 1081 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateIsNotNull(), emitBlock(), emitBranch(), emitTargetKernel(), llvm::Failed(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::GetInsertBlock(), getKernelArgsVector(), llvm::BasicBlock::getParent(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), and updateToLocation().
Referenced by emitTargetTask().
void OpenMPIRBuilder::emitMapperCall | ( | const LocationDescription & | Loc, |
Function * | MapperFunc, | ||
Value * | SrcLocInfo, | ||
Value * | MaptypesArg, | ||
Value * | MapnamesArg, | ||
struct MapperAllocas & | MapperAllocas, | ||
int64_t | DeviceID, | ||
unsigned | NumOperands | ||
) |
Create the call for the target mapper function.
Loc | The source location description. |
MapperFunc | Function to be called. |
SrcLocInfo | Source location information global. |
MaptypesArg | The argument types. |
MapnamesArg | The argument names. |
MapperAllocas | The AllocaInst used for the call. |
DeviceID | Device ID for the call. |
NumOperands | Number of operands in the call. |
Definition at line 7298 of file OMPIRBuilder.cpp.
References llvm::OpenMPIRBuilder::MapperAllocas::Args, llvm::OpenMPIRBuilder::MapperAllocas::ArgsBase, llvm::OpenMPIRBuilder::MapperAllocas::ArgSizes, Builder, llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::ArrayType::get(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), llvm::Constant::getNullValue(), llvm::PointerType::getUnqual(), Int64, and updateToLocation().
void OpenMPIRBuilder::emitNonContiguousDescriptor | ( | InsertPointTy | AllocaIP, |
InsertPointTy | CodeGenIP, | ||
MapInfosTy & | CombinedInfo, | ||
TargetDataInfo & | Info | ||
) |
Emit an array of struct descriptors to be assigned to the offload args.
Definition at line 7383 of file OMPIRBuilder.cpp.
References Builder, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Counts, llvm::StructType::create(), llvm::IRBuilderBase::CreateAlignedStore(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_32(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreatePointerBitCastOrAddrSpaceCast(), llvm::IRBuilderBase::CreateStructGEP(), llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Dims, llvm::ArrayType::get(), llvm::AllocaInst::getAllocatedType(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::IRBuilderBase::getInt64Ty(), llvm::DataLayout::getPrefTypeAlign(), llvm::IRBuilderBase::getPtrTy(), llvm::Value::getType(), I, II, Info, M, llvm::OpenMPIRBuilder::MapInfosTy::NonContigInfo, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Offsets, P, llvm::IRBuilderBase::restoreIP(), llvm::SmallVectorBase< Size_T >::size(), and llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Strides.
Referenced by emitOffloadingArrays().
void OpenMPIRBuilder::emitOffloadingArrays | ( | InsertPointTy | AllocaIP, |
InsertPointTy | CodeGenIP, | ||
MapInfosTy & | CombinedInfo, | ||
TargetDataInfo & | Info, | ||
bool | IsNonContiguous = false , |
||
function_ref< void(unsigned int, Value *)> | DeviceAddrCB = nullptr , |
||
function_ref< Value *(unsigned int)> | CustomMapperCB = nullptr |
||
) |
Emit the arrays used to pass the captures and map information to the offloading runtime library.
If there is no map or capture information, return nullptr by reference. Accepts a reference to a MapInfosTy object that contains information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers.
Definition at line 7451 of file OMPIRBuilder.cpp.
References Address, llvm::SmallBitVector::all(), llvm::SmallBitVector::any(), llvm::OpenMPIRBuilder::MapInfosTy::BasePointers, Builder, llvm::IRBuilderBase::CreateAlignedStore(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_32(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreateIntCast(), llvm::IRBuilderBase::CreateMemCpy(), createOffloadMapnames(), createOffloadMaptypes(), createPlatformSpecificName(), llvm::IRBuilderBase::CreatePointerCast(), llvm::OpenMPIRBuilder::MapInfosTy::DevicePointers, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Dims, emitNonContiguousDescriptor(), llvm::SmallVectorBase< Size_T >::empty(), llvm::ConstantArray::get(), llvm::ConstantPointerNull::get(), llvm::ArrayType::get(), llvm::DataLayout::getABIIntegerTypeAlignment(), llvm::AllocaInst::getAllocatedType(), llvm::AllocaInst::getAllocationSize(), llvm::IRBuilderBase::getContext(), llvm::Module::getDataLayout(), llvm::DataLayout::getIndexSizeInBits(), llvm::IRBuilderBase::getInt64Ty(), llvm::IRBuilderBase::getIntN(), llvm::Constant::getNullValue(), llvm::DataLayout::getPrefTypeAlign(), llvm::IRBuilderBase::getPtrTy(), llvm::AllocaInst::getType(), llvm::Value::getType(), llvm::PointerType::getUnqual(), llvm::GlobalValue::Global, I, Info, M, Name, llvm::OpenMPIRBuilder::MapInfosTy::Names, llvm::OpenMPIRBuilder::MapInfosTy::NonContigInfo, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Offsets, P, Pointer, llvm::OpenMPIRBuilder::MapInfosTy::Pointers, llvm::GlobalValue::PrivateLinkage, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::SmallBitVector::set(), llvm::AllocaInst::setAlignment(), llvm::SmallVectorBase< Size_T >::size(), llvm::OpenMPIRBuilder::MapInfosTy::Sizes, llvm::SmallBitVector::test(), and llvm::OpenMPIRBuilder::MapInfosTy::Types.
Referenced by createTargetData(), and emitOffloadingArraysAndArgs().
void OpenMPIRBuilder::emitOffloadingArraysAndArgs | ( | InsertPointTy | AllocaIP, |
InsertPointTy | CodeGenIP, | ||
TargetDataInfo & | Info, | ||
TargetDataRTArgs & | RTArgs, | ||
MapInfosTy & | CombinedInfo, | ||
bool | IsNonContiguous = false , |
||
bool | ForEndCall = false , |
||
function_ref< void(unsigned int, Value *)> | DeviceAddrCB = nullptr , |
||
function_ref< Value *(unsigned int)> | CustomMapperCB = nullptr |
||
) |
Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}).
Then, it emits their base addresses as arguments to be passed to the runtime library. In essence, this function is a combination of emitOffloadingArrays and emitOffloadingArraysArgument and should arguably be preferred by clients of OpenMPIRBuilder.
Definition at line 7071 of file OMPIRBuilder.cpp.
References Builder, emitOffloadingArrays(), emitOffloadingArraysArgument(), and Info.
void OpenMPIRBuilder::emitOffloadingArraysArgument | ( | IRBuilderBase & | Builder, |
OpenMPIRBuilder::TargetDataRTArgs & | RTArgs, | ||
OpenMPIRBuilder::TargetDataInfo & | Info, | ||
bool | ForEndCall = false |
||
) |
Emit the arguments to be passed to the runtime library based on the arrays of base pointers, pointers, sizes, map types, and mappers.
If ForEndCall, emit map types to be passed for the end of the region instead of the beginning.
Definition at line 7325 of file OMPIRBuilder.cpp.
References assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, Builder, llvm::IRBuilderBase::CreateConstInBoundsGEP2_32(), llvm::IRBuilderBase::CreatePointerCast(), llvm::ConstantPointerNull::get(), llvm::ArrayType::get(), llvm::Module::getContext(), llvm::Type::getInt64Ty(), llvm::PointerType::getUnqual(), Info, M, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, and llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray.
Referenced by createTargetData(), and emitOffloadingArraysAndArgs().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitTargetKernel | ( | const LocationDescription & | Loc, |
InsertPointTy | AllocaIP, | ||
Value *& | Return, | ||
Value * | Ident, | ||
Value * | DeviceID, | ||
Value * | NumTeams, | ||
Value * | NumThreads, | ||
Value * | HostPtr, | ||
ArrayRef< Value * > | KernelArgs | ||
) |
Generate a target region entry call.
Loc | The location at which the request originated and is fulfilled. |
AllocaIP | The insertion point to be used for alloca instructions. |
Return | Return value of the created function returned by reference. |
DeviceID | Identifier for the device via the 'device' clause. |
NumTeams | Numer of teams for the region via the 'num_teams' clause or 0 if unspecified and -1 if there is no 'teams' clause. |
NumThreads | Number of threads via the 'thread_limit' clause. |
HostPtr | Pointer to the host-side pointer of the target kernel. |
KernelArgs | Array of arguments to the kernel. |
Definition at line 1051 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateAlignedStore(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateStructGEP(), llvm::Module::getDataLayout(), getOrCreateRuntimeFunction(), llvm::DataLayout::getPrefTypeAlign(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, M, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::ArrayRef< T >::size(), Size, and updateToLocation().
Referenced by emitKernelLaunch().
void OpenMPIRBuilder::emitTargetRegionFunction | ( | TargetRegionEntryInfo & | EntryInfo, |
FunctionGenCallback & | GenerateFunctionCallback, | ||
bool | IsOffloadEntry, | ||
Function *& | OutlinedFn, | ||
Constant *& | OutlinedFnID | ||
) |
Create a unique name for the entry function using the source location information of the current target region.
The name will be something like:
__omp_offloading_DD_FFFF_PP_lBB[_CC]
where DD_FFFF is an ID unique to the file (device and file IDs), PP is the mangled name of the function that encloses the target region and BB is the line number of the target region. CC is a count added when more than one region is located at the same location.
If this target outline function is not an offload entry, we don't need to register it. This may happen if it is guarded by an if clause that is false at compile time, or no target archs have been specified.
The created target region ID is used by the runtime library to identify the current target region, so it only has to be unique and not necessarily point to anything. It could be the pointer to the outlined function that implements the target region, but we aren't using that so that the compiler doesn't need to keep that, and could therefore inline the host function if proven worthwhile during optimization. In the other hand, if emitting code for the device, the ID has to be the function address so that it can retrieved from the offloading entry and launched by the runtime library. We also mark the outlined function to have external linkage in case we are emitting code for the device, because these functions will be entry points to the device.
InfoManager | The info manager keeping track of the offload entries |
EntryInfo | The entry information about the function |
GenerateFunctionCallback | The callback function to generate the code |
OutlinedFunction | Pointer to the outlined function |
EntryFnIDName | Name of the ID o be created |
Definition at line 6311 of file OMPIRBuilder.cpp.
References Config, createPlatformSpecificName(), llvm::OffloadEntriesInfoManager::getTargetRegionEntryFnName(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), OffloadInfoManager, llvm::OpenMPIRBuilderConfig::openMPOffloadMandatory(), and registerTargetRegionFunction().
Referenced by emitTargetOutlinedFunction().
OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitTargetTask | ( | Function * | OutlinedFn, |
Value * | OutlinedFnID, | ||
EmitFallbackCallbackTy | EmitTargetCallFallbackCB, | ||
TargetKernelArgs & | Args, | ||
Value * | DeviceID, | ||
Value * | RTLoc, | ||
OpenMPIRBuilder::InsertPointTy | AllocaIP, | ||
SmallVector< OpenMPIRBuilder::DependData > & | Dependencies, | ||
bool | HasNoWait | ||
) |
Generate a target-task for the target construct.
OutlinedFn | The outlined device/target kernel function. |
OutlinedFnID | The ooulined function ID. |
EmitTargetCallFallbackCB | Call back function to generate host fallback code. |
Args | Data structure holding information about the kernel arguments. |
DeviceID | Identifier for the device via the 'device' clause. |
RTLoc | Source location identifier |
AllocaIP | The insertion point to be used for alloca instructions. |
Dependencies | Vector of DependData objects holding information of dependencies as specified by the 'depend' clause. |
HasNoWait | True if the target construct had 'nowait' on it, false otherwise |
Definition at line 6799 of file OMPIRBuilder.cpp.
References addOutlineInfo(), llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, llvm::IRBuilderBase::CreateCall(), createFakeIntVal(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateMemCpy(), llvm::dbgs(), emitKernelLaunch(), emitTargetTaskProxyFunction(), emitTaskDependencies(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::omp::Flags, llvm::ConstantPointerNull::get(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::Instruction::getDebugLoc(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt32Ty(), llvm::IRBuilderBase::getInt64(), llvm::Value::getNumUses(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::GlobalValue::getParent(), llvm::BasicBlock::getParent(), llvm::Value::getPointerAlignment(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), I, LLVM_DEBUG, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::IRBuilderBase::restoreIP(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::Instruction::setDebugLoc(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SmallVectorBase< Size_T >::size(), llvm::splitBB(), and llvm::Value::user_back().
void OpenMPIRBuilder::emitTaskwaitImpl | ( | const LocationDescription & | Loc | ) |
Generate a taskwait runtime call.
Loc | The location at which the request originated and is fulfilled. |
Definition at line 1685 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), and getOrCreateThreadID().
Referenced by createTaskwait().
void OpenMPIRBuilder::emitTaskyieldImpl | ( | const LocationDescription & | Loc | ) |
Generate a taskyield runtime call.
Loc | The location at which the request originated and is fulfilled. |
Definition at line 1704 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), llvm::Constant::getNullValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and Int32.
Referenced by createTaskyield().
void OpenMPIRBuilder::finalize | ( | Function * | Fn = nullptr | ) |
Finalize the underlying module, e.g., by outlining regions.
Fn | The function to be finalized. If not used, all functions are finalized. |
Definition at line 674 of file OMPIRBuilder.cpp.
References llvm::Function::addFnAttr(), assert(), Blocks, Builder, llvm::SmallPtrSetImplBase::clear(), Config, ConstantAllocaRaiseCandidates, createOffloadEntriesAndInfoMetadata(), llvm::dbgs(), llvm::OpenMPIRBuilderConfig::EmitLLVMUsedMetaInfo, llvm::OffloadEntriesInfoManager::empty(), llvm::BasicBlock::empty(), End, llvm::BasicBlock::eraseFromParent(), llvm::errs(), llvm::CodeExtractor::excludeArgFromAggregate(), llvm::CodeExtractor::extractCodeRegion(), F, llvm::Function::getEntryBlock(), llvm::Function::getFnAttribute(), llvm::Function::getFunction(), llvm::Module::getFunctionList(), llvm::Module::getGlobalVariable(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::Value::getNumUses(), llvm::Function::getReturnType(), llvm::BasicBlock::getUniqueSuccessor(), I, llvm::iplist_impl< IntrusiveListT, TraitsT >::insertAfter(), llvm::CodeExtractor::isEligible(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::Type::isVoidTy(), LLVM_DEBUG, M, OffloadInfoManager, OutlineInfos, llvm::SmallVectorTemplateBase< T, bool >::push_back(), raiseUserConstantDataAllocasToEntryBlock(), llvm::BasicBlock::rbegin(), llvm::Function::removeFromParent(), and llvm::BasicBlock::rend().
Constant * OpenMPIRBuilder::getAddrOfDeclareTargetVar | ( | OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind | CaptureClause, |
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind | DeviceClause, | ||
bool | IsDeclaration, | ||
bool | IsExternallyVisible, | ||
TargetRegionEntryInfo | EntryInfo, | ||
StringRef | MangledName, | ||
std::vector< GlobalVariable * > & | GeneratedRefs, | ||
bool | OpenMPSIMD, | ||
std::vector< Triple > | TargetTriple, | ||
Type * | LlvmPtrTy, | ||
std::function< Constant *()> | GlobalInitializer, | ||
std::function< GlobalValue::LinkageTypes()> | VariableLinkage | ||
) |
Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables.
CaptureClause | - enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter). |
DeviceClause | - enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any) |
IsDeclaration | - boolean stating if the variable being registered is a declaration-only and not a definition |
IsExternallyVisible | - boolean stating if the variable is externally visible |
EntryInfo | - Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable |
MangledName | - the mangled name of the variable being registered |
GeneratedRefs | - references generated by invocations of registerTargetGlobalVariable invoked from getAddrOfDeclareTargetVar, these are required by Clang for book keeping. |
OpenMPSIMD | - if OpenMP SIMD mode is currently enabled |
TargetTriple | - The OpenMP device target triple we are compiling for |
LlvmPtrTy | - The type of the variable we are generating or retrieving an address for |
GlobalInitializer | - a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference. |
VariableLinkage | - a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule. |
Definition at line 8691 of file OMPIRBuilder.cpp.
References Config, llvm::TargetRegionEntryInfo::FileID, llvm::format(), llvm::Module::getNamedValue(), getOrCreateInternalVariable(), llvm::OpenMPIRBuilderConfig::hasRequiresUnifiedSharedMemory(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), M, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, OS, Ptr, registerTargetGlobalVariable(), and llvm::GlobalValue::WeakAnyLinkage.
Referenced by registerTargetGlobalVariable().
unsigned OpenMPIRBuilder::getFlagMemberOffset | ( | ) |
Get the offset of the OMP_MAP_MEMBER_OF field.
Definition at line 8655 of file OMPIRBuilder.cpp.
References llvm::Offset, and llvm::omp::OMP_MAP_MEMBER_OF.
Referenced by getMemberOfFlag().
|
inline |
}
Return the insertion point used by the underlying IRBuilder.
Definition at line 1932 of file OMPIRBuilder.h.
References Builder, and llvm::IRBuilderBase::saveIP().
|
static |
Create the kernel args vector used by emitTargetKernel.
This function creates various constant values that are used in the resulting args vector.
Definition at line 495 of file OMPIRBuilder.cpp.
References assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, Builder, llvm::IRBuilderBase::CreateInsertValue(), llvm::OpenMPIRBuilder::TargetKernelArgs::DynCGGroupMem, llvm::omp::Flags, llvm::ArrayType::get(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt32Ty(), llvm::IRBuilderBase::getInt64(), llvm::Constant::getNullValue(), llvm::OpenMPIRBuilder::TargetKernelArgs::HasNoWait, I, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::TargetKernelArgs::NumIterations, llvm::OpenMPIRBuilder::TargetKernelArgs::NumTargetItems, llvm::OpenMPIRBuilder::TargetKernelArgs::NumTeams, llvm::OpenMPIRBuilder::TargetKernelArgs::NumThreads, OMP_KERNEL_ARG_VERSION, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, llvm::OpenMPIRBuilder::TargetKernelArgs::RTArgs, llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray, and llvm::Version.
Referenced by emitKernelLaunch().
omp::OpenMPOffloadMappingFlags OpenMPIRBuilder::getMemberOfFlag | ( | unsigned | Position | ) |
Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given.
Position | - A value indicating the position of the parent of the member in the kernel argument structure, often retrieved by the parents position in the combined information vectors used to generate the structure itself. Multiple children (member's of) with the same parent will use the same returned member flag. |
Definition at line 8666 of file OMPIRBuilder.cpp.
References getFlagMemberOffset().
|
static |
Get the default alignment value for given target.
Definition at line 5159 of file OMPIRBuilder.cpp.
References llvm::Triple::isPPC(), llvm::Triple::isWasm(), llvm::Triple::isX86(), and llvm::StringMap< ValueTy, AllocatorTy >::lookup().
Return the (LLVM-IR) string describing the default source location.
Definition at line 913 of file OMPIRBuilder.cpp.
References getOrCreateSrcLocStr().
Referenced by getOrCreateSrcLocStr().
Constant * OpenMPIRBuilder::getOrCreateIdent | ( | Constant * | SrcLocStr, |
uint32_t | SrcLocStrSize, | ||
omp::IdentFlag | Flags = omp::IdentFlag(0) , |
||
unsigned | Reserve2Flags = 0 |
||
) |
Return an ident_t* encoding the source location SrcLocStr
and Flags
.
TODO: Create a enum class for the Reserve2Flags
Definition at line 833 of file OMPIRBuilder.cpp.
References llvm::ConstantStruct::get(), llvm::Module::getDataLayout(), llvm::DataLayout::getDefaultGlobalsAddressSpace(), llvm::Constant::getNullValue(), llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast(), llvm::GlobalValue::Global, llvm::Module::globals(), IdentMap, Int32, M, llvm::GlobalValue::NotThreadLocal, and llvm::GlobalValue::PrivateLinkage.
Referenced by createBarrier(), createCachedThreadPrivate(), createCancel(), createCopyPrivate(), createCritical(), createMasked(), createMaster(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSingle(), createTargetData(), createTargetInit(), createTask(), createTaskgroup(), createTeams(), emitFlush(), emitTargetTask(), emitTaskwaitImpl(), and emitTaskyieldImpl().
GlobalVariable * OpenMPIRBuilder::getOrCreateInternalVariable | ( | Type * | Ty, |
const StringRef & | Name, | ||
unsigned | AddressSpace = 0 |
||
) |
Gets (if variable with the given name already exist) or creates internal global variable with the specified Name.
The created variable has linkage CommonLinkage by default and is initialized by null value.
Ty | Type of the global variable. If it is exist already the type must be the same. |
Name | Name of the variable. |
Definition at line 7219 of file OMPIRBuilder.cpp.
References assert(), llvm::GlobalValue::CommonLinkage, DL, llvm::Module::getDataLayout(), llvm::Constant::getNullValue(), llvm::Module::getTargetTriple(), llvm::GlobalValue::InternalLinkage, InternalVars, M, Name, and llvm::GlobalValue::NotThreadLocal.
Referenced by createCachedThreadPrivate(), getAddrOfDeclareTargetVar(), and registerTargetGlobalVariable().
FunctionCallee OpenMPIRBuilder::getOrCreateRuntimeFunction | ( | Module & | M, |
omp::RuntimeFunction | FnID | ||
) |
Return the function declaration for the runtime function with FnID
.
Definition at line 586 of file OMPIRBuilder.cpp.
References addAttributes(), llvm::GlobalObject::addMetadata(), assert(), llvm::MDBuilder::createCallbackEncoding(), llvm::dbgs(), llvm::MDNode::get(), llvm::Function::getContext(), llvm::Function::getFunctionType(), llvm::Value::getName(), llvm::GlobalObject::hasMetadata(), and LLVM_DEBUG.
Referenced by createDispatchDeinitFunction(), createDispatchFiniFunction(), createDispatchInitFunction(), createDispatchNextFunction(), createForStaticInitFunction(), createTargetLoopWorkshareCall(), emitTargetKernel(), getKmpcForDynamicFiniForType(), getKmpcForDynamicInitForType(), getKmpcForDynamicNextForType(), getKmpcForStaticInitForType(), getKmpcForStaticLoopForType(), and getOrCreateRuntimeFunctionPtr().
Function * OpenMPIRBuilder::getOrCreateRuntimeFunctionPtr | ( | omp::RuntimeFunction | FnID | ) |
Definition at line 642 of file OMPIRBuilder.cpp.
References assert(), llvm::FunctionCallee::getCallee(), getOrCreateRuntimeFunction(), and M.
Referenced by createBarrier(), createCachedThreadPrivate(), createCancel(), createCopyPrivate(), createCritical(), createMasked(), createMaster(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSingle(), createTargetData(), createTargetDeinit(), createTargetInit(), createTask(), createTaskgroup(), createTeams(), emitFlush(), emitTargetTask(), emitTaskwaitImpl(), emitTaskyieldImpl(), getOrCreateThreadID(), hostParallelCallback(), and targetParallelCallback().
Constant * OpenMPIRBuilder::getOrCreateSrcLocStr | ( | const LocationDescription & | Loc, |
uint32_t & | SrcLocStrSize | ||
) |
Return the (LLVM-IR) string describing the source location Loc
.
Definition at line 935 of file OMPIRBuilder.cpp.
References llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::IRBuilderBase::InsertPoint::getBlock(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), and llvm::OpenMPIRBuilder::LocationDescription::IP.
Constant * OpenMPIRBuilder::getOrCreateSrcLocStr | ( | DebugLoc | DL, |
uint32_t & | SrcLocStrSize, | ||
Function * | F = nullptr |
||
) |
Return the (LLVM-IR) string describing the DebugLoc DL
.
Use F
as fallback if DL
does not specify the function name.
Definition at line 918 of file OMPIRBuilder.cpp.
References DL, llvm::Function::empty(), F, llvm::Module::getName(), getOrCreateDefaultSrcLocStr(), getOrCreateSrcLocStr(), and M.
Constant * OpenMPIRBuilder::getOrCreateSrcLocStr | ( | StringRef | FunctionName, |
StringRef | FileName, | ||
unsigned | Line, | ||
unsigned | Column, | ||
uint32_t & | SrcLocStrSize | ||
) |
Return the (LLVM-IR) string describing the source location identified by the arguments.
Definition at line 894 of file OMPIRBuilder.cpp.
References llvm::SmallString< InternalLen >::append(), getOrCreateSrcLocStr(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and llvm::SmallString< InternalLen >::str().
Return the (LLVM-IR) string describing the source location LocStr
.
Definition at line 873 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateGlobalStringPtr(), llvm::Module::getContext(), llvm::ConstantExpr::getPointerCast(), llvm::ConstantDataArray::getString(), llvm::Module::globals(), M, llvm::StringRef::size(), and SrcLocStrMap.
Referenced by createBarrier(), createCachedThreadPrivate(), createCancel(), createCopyPrivate(), createCritical(), createMasked(), createMaster(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSingle(), createTargetData(), createTargetInit(), createTask(), createTaskgroup(), createTeams(), emitFlush(), emitTargetTask(), emitTaskwaitImpl(), emitTaskyieldImpl(), getOrCreateDefaultSrcLocStr(), and getOrCreateSrcLocStr().
Return the current thread ID.
Ident | The ident (ident_t*) describing the query origin. |
Definition at line 941 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateCall(), and getOrCreateRuntimeFunctionPtr().
Referenced by createBarrier(), createCachedThreadPrivate(), createCancel(), createCopyPrivate(), createCritical(), createMasked(), createMaster(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createSingle(), createTask(), createTaskgroup(), createTeams(), emitTargetTask(), emitTaskwaitImpl(), and emitTaskyieldImpl().
Computes the size of type in bytes.
Definition at line 7253 of file OMPIRBuilder.cpp.
References Builder, llvm::IRBuilderBase::CreateGEP(), llvm::IRBuilderBase::CreatePtrToInt(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt64Ty(), llvm::Constant::getNullValue(), llvm::PointerType::getUnqual(), and llvm::Null.
|
static |
Creates a unique info for a target entry when provided a filename and line number from.
CallBack | A callback function which should return filename the entry resides in as well as the line number for the target entry |
ParentName | The name of the parent the target entry resides in, if any. |
Definition at line 8640 of file OMPIRBuilder.cpp.
References llvm::sys::fs::getUniqueID(), and llvm::report_fatal_error().
void OpenMPIRBuilder::initialize | ( | ) |
Initialize the internal state, this will put structures types and potentially other helpers into the underlying module.
Must be called before any other method and only once! This internal state includes types used in the OpenMPIRBuilder generated from OMPKinds.def.
Definition at line 649 of file OMPIRBuilder.cpp.
References M.
|
inline |
Return true if the last entry in the finalization stack is of kind DK
and cancellable.
Definition at line 2016 of file OMPIRBuilder.h.
References FinalizationStack.
Referenced by createBarrier(), and emitCancelationCheckImpl().
void OpenMPIRBuilder::loadOffloadInfoMetadata | ( | Module & | M | ) |
Loads all the offload entries information from the host IR metadata.
This function is only meant to be used with device code generation.
M | Module to load Metadata info from. Module passed maybe loaded from bitcode file, i.e, different from OpenMPIRBuilder::M module. |
Definition at line 8828 of file OMPIRBuilder.cpp.
References llvm::Module::getNamedMetadata(), Idx, llvm::OffloadEntriesInfoManager::initializeDeviceGlobalVarEntryInfo(), llvm::OffloadEntriesInfoManager::initializeTargetRegionEntryInfo(), llvm_unreachable, M, OffloadInfoManager, llvm::OffloadEntriesInfoManager::OffloadEntryInfo::OffloadingEntryInfoDeviceGlobalVar, llvm::OffloadEntriesInfoManager::OffloadEntryInfo::OffloadingEntryInfoTargetRegion, ompOffloadInfoName, and llvm::NamedMDNode::operands().
Referenced by loadOffloadInfoMetadata().
void OpenMPIRBuilder::loadOffloadInfoMetadata | ( | StringRef | HostFilePath | ) |
Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument.
This function is only meant to be used with device code generation.
HostFilePath | The path to the host IR file, used to load in offload metadata for the device, allowing host and device to maintain the same metadata mapping. |
Definition at line 8874 of file OMPIRBuilder.cpp.
References llvm::StringRef::empty(), llvm::expectedToErrorOrAndEmitErrors(), llvm::MemoryBuffer::getFile(), loadOffloadInfoMetadata(), M, llvm::parseBitcodeFile(), and llvm::report_fatal_error().
|
inline |
Pop the last finalization callback from the finalization stack.
NOTE: Temporary solution until Clang CG is gone.
Definition at line 544 of file OMPIRBuilder.h.
References FinalizationStack.
|
inline |
Push a finalization callback on the finalization stack.
NOTE: Temporary solution until Clang CG is gone.
Definition at line 537 of file OMPIRBuilder.h.
References FinalizationStack.
|
static |
Read/write a bounds on teams for Kernel
.
Read will return 0 if none is set.
Definition at line 6259 of file OMPIRBuilder.cpp.
References llvm::Function::getFnAttributeAsParsedInteger().
|
static |
}
Helpers to read/write kernel annotations from the IR.
{ Read/write a bounds on threads for Kernel
. Read will return 0 if none is set.
Definition at line 6218 of file OMPIRBuilder.cpp.
References llvm::Function::getFnAttribute(), llvm::Function::getFnAttributeAsParsedInteger(), getNVPTXMDNode(), llvm::Attribute::getValueAsString(), and llvm::StringRef::split().
void OpenMPIRBuilder::registerTargetGlobalVariable | ( | OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind | CaptureClause, |
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind | DeviceClause, | ||
bool | IsDeclaration, | ||
bool | IsExternallyVisible, | ||
TargetRegionEntryInfo | EntryInfo, | ||
StringRef | MangledName, | ||
std::vector< GlobalVariable * > & | GeneratedRefs, | ||
bool | OpenMPSIMD, | ||
std::vector< Triple > | TargetTriple, | ||
std::function< Constant *()> | GlobalInitializer, | ||
std::function< GlobalValue::LinkageTypes()> | VariableLinkage, | ||
Type * | LlvmPtrTy, | ||
Constant * | Addr | ||
) |
Registers a target variable for device or host.
CaptureClause | - enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter). |
DeviceClause | - enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any) |
IsDeclaration | - boolean stating if the variable being registered is a declaration-only and not a definition |
IsExternallyVisible | - boolean stating if the variable is externally visible |
EntryInfo | - Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable |
MangledName | - the mangled name of the variable being registered |
GeneratedRefs | - references generated by invocations of registerTargetGlobalVariable these are required by Clang for book keeping. |
OpenMPSIMD | - if OpenMP SIMD mode is currently enabled |
TargetTriple | - The OpenMP device target triple we are compiling for |
GlobalInitializer | - a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference. |
VariableLinkage | - a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule. |
LlvmPtrTy | - The type of the variable we are generating or retrieving an address for |
Addr | - the original llvm value (addr) of the variable to be registered |
Definition at line 8747 of file OMPIRBuilder.cpp.
References Addr, Config, createPlatformSpecificName(), llvm::divideCeil(), llvm::omp::Flags, getAddrOfDeclareTargetVar(), llvm::Module::getDataLayout(), llvm::GlobalValue::getLinkage(), llvm::Module::getNamedValue(), getOrCreateInternalVariable(), llvm::DataLayout::getPointerSize(), llvm::DataLayout::getTypeSizeInBits(), llvm::GlobalValue::getValueType(), llvm::OffloadEntriesInfoManager::hasDeviceGlobalVarEntryInfo(), llvm::OpenMPIRBuilderConfig::hasRequiresUnifiedSharedMemory(), llvm::GlobalValue::InternalLinkage, llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::GlobalValue::LinkOnceODRLinkage, M, OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetDeviceClauseAny, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, llvm::OffloadEntriesInfoManager::registerDeviceGlobalVarEntryInfo(), and llvm::GlobalValue::WeakAnyLinkage.
Referenced by getAddrOfDeclareTargetVar().
Constant * OpenMPIRBuilder::registerTargetRegionFunction | ( | TargetRegionEntryInfo & | EntryInfo, |
Function * | OutlinedFunction, | ||
StringRef | EntryFnName, | ||
StringRef | EntryFnIDName | ||
) |
Registers the given function and sets up the attribtues of the function Returns the FunctionID.
InfoManager | The info manager keeping track of the offload entries |
EntryInfo | The entry information about the function |
OutlinedFunction | Pointer to the outlined function |
EntryFnName | Name of the outlined function |
EntryFnIDName | Name of the ID o be created |
Definition at line 6338 of file OMPIRBuilder.cpp.
References OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetRegionEntryTargetRegion, and llvm::OffloadEntriesInfoManager::registerTargetRegionEntryInfo().
Referenced by emitTargetRegionFunction().
|
inline |
Definition at line 488 of file OMPIRBuilder.h.
References llvm::CallingConv::C, and Config.
void OpenMPIRBuilder::setCorrectMemberOfFlag | ( | omp::OpenMPOffloadMappingFlags & | Flags, |
omp::OpenMPOffloadMappingFlags | MemberOfFlag | ||
) |
Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function.
The results are dependent on the existing flag bits set in the original flag set.
Flags | - The original set of flags to be modified with the passed in MemberOfFlag. |
MemberOfFlag | - A modified OMP_MAP_MEMBER_OF flag, adjusted slightly based on the getMemberOfFlag which adjusts the flag bits based on the members position in its parent. |
Definition at line 8672 of file OMPIRBuilder.cpp.
References llvm::omp::Flags, llvm::omp::OMP_MAP_MEMBER_OF, and llvm::omp::OMP_MAP_PTR_AND_OBJ.
std::vector< CanonicalLoopInfo * > OpenMPIRBuilder::tileLoops | ( | DebugLoc | DL, |
ArrayRef< CanonicalLoopInfo * > | Loops, | ||
ArrayRef< Value * > | TileSizes | ||
) |
Tile a loop nest.
Tiles the loops of Loops
by the tile sizes in TileSizes
. Loops in / Loops must be perfectly nested, from outermost to innermost loop (i.e. Loops.front() is the outermost loop). The trip count llvm::Value of every loop and every tile sizes must be usable in the outermost loop's preheader. This implies that the loop nest is rectangular.
Example:
After tiling with Loops={i,j} and TileSizes={5,7}, the loop is changed to
The returned vector are the loops {i1,j1,i2,j2}. The loops i1 and j1 are referred to the floor, and the loops i2 and j2 are the tiles. Tiling also handles non-constant trip counts, non-constant tile sizes and trip counts that are not multiples of the tile size. In the latter case the tile loop of the last floor-loop iteration will have fewer iterations than specified as its tile size.
DL | Debug location for instructions added by tiling, for instance the floor- and tile trip count computation. |
Loops | Loops to tile. The CanonicalLoopInfo objects are invalidated by this method, i.e. should not used after tiling. |
TileSizes | For each loop in Loops , the tile size for that dimensions. |
Definition at line 4843 of file OMPIRBuilder.cpp.
References assert(), Builder, llvm::Continue, llvm::IRBuilderBase::CreateAdd(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateICmpNE(), createLoopSkeleton(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateUDiv(), llvm::IRBuilderBase::CreateURem(), llvm::IRBuilderBase::CreateZExt(), DL, llvm::SmallVectorImpl< T >::emplace_back(), llvm::enumerate(), F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getExit(), llvm::CanonicalLoopInfo::getIndVar(), llvm::CanonicalLoopInfo::getLatch(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), Loops, Name, P, llvm::SmallVectorTemplateBase< T, bool >::push_back(), redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::Value::replaceAllUsesWith(), llvm::SmallVectorImpl< T >::reserve(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::SetCurrentDebugLocation(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), Size, and TileSize.
Referenced by unrollLoopPartial().
void OpenMPIRBuilder::unrollLoopFull | ( | DebugLoc | DL, |
CanonicalLoopInfo * | Loop | ||
) |
Fully unroll a loop.
Instead of unrolling the loop immediately (and duplicating its body instructions), it is deferred to LLVM's LoopUnrollPass by adding loop metadata.
DL | Debug location for instructions added by unrolling. |
Loop | The loop to unroll. The loop will be invalidated. |
Definition at line 5082 of file OMPIRBuilder.cpp.
References addLoopMetadata(), Builder, llvm::MDNode::get(), llvm::MDString::get(), and llvm::IRBuilderBase::getContext().
void OpenMPIRBuilder::unrollLoopHeuristic | ( | DebugLoc | DL, |
CanonicalLoopInfo * | Loop | ||
) |
Fully or partially unroll a loop.
How the loop is unrolled is determined using LLVM's LoopUnrollPass.
DL | Debug location for instructions added by unrolling. |
Loop | The loop to unroll. The loop will be invalidated. |
Definition at line 5089 of file OMPIRBuilder.cpp.
References addLoopMetadata(), Builder, llvm::MDNode::get(), llvm::MDString::get(), and llvm::IRBuilderBase::getContext().
void OpenMPIRBuilder::unrollLoopPartial | ( | DebugLoc | DL, |
CanonicalLoopInfo * | Loop, | ||
int32_t | Factor, | ||
CanonicalLoopInfo ** | UnrolledCLI | ||
) |
Partially unroll a loop.
The CanonicalLoopInfo of the unrolled loop for use with chained loop-associated directive can be requested using UnrolledCLI
. Not needing the CanonicalLoopInfo allows more efficient code generation by deferring the actual unrolling to the LoopUnrollPass using loop metadata. A loop-associated directive applied to the unrolled loop needs to know the new trip count which means that if using a heuristically determined unroll factor (Factor
== 0), that factor must be computed immediately. We are using the same logic as the LoopUnrollPass to derived the unroll factor, but which assumes that some canonicalization has taken place (e.g. Mem2Reg, LICM, GVN, Inlining, etc.). That is, the heuristic will perform better when the unrolled loop's CanonicalLoopInfo is not needed.
DL | Debug location for instructions added by unrolling. |
Loop | The loop to unroll. The loop will be invalidated. |
Factor | The factor to unroll the loop by. A factor of 0 indicates that a heuristic should be used to determine the unroll-factor. |
UnrolledCLI | If non-null, receives the CanonicalLoopInfo of the partially unrolled loop. Otherwise, uses loop metadata to defer unrolling to the LoopUnrollPass. |
Definition at line 5444 of file OMPIRBuilder.cpp.
References addLoopMetadata(), assert(), computeHeuristicUnrollFactor(), DL, F, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::Type::getInt32Ty(), llvm::Type::getIntegerBitWidth(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and tileLoops().
|
inline |
Update the internal location to Loc
.
Definition at line 1935 of file OMPIRBuilder.h.
References Builder, llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), and llvm::IRBuilderBase::SetCurrentDebugLocation().
Referenced by createAtomicCapture(), createAtomicCompare(), createAtomicRead(), createAtomicUpdate(), createAtomicWrite(), createBarrier(), createCachedThreadPrivate(), createCancel(), createCanonicalLoop(), createCopyPrivate(), createCritical(), createFlush(), createMapperAllocas(), createMasked(), createMaster(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSection(), createSections(), createSingle(), createTarget(), createTargetData(), createTargetDeinit(), createTargetInit(), createTask(), createTaskgroup(), createTaskwait(), createTaskyield(), createTeams(), emitKernelLaunch(), emitMapperCall(), and emitTargetKernel().
|
static |
Definition at line 6264 of file OMPIRBuilder.cpp.
References llvm::Function::addFnAttr(), and updateNVPTXMetadata().
Referenced by createTargetInit().
|
static |
Definition at line 6244 of file OMPIRBuilder.cpp.
References llvm::Function::addFnAttr(), and updateNVPTXMetadata().
Referenced by createTargetInit().
IRBuilder llvm::OpenMPIRBuilder::Builder |
The LLVM-IR Builder used to create IR.
Definition at line 2044 of file OMPIRBuilder.h.
Referenced by applySimd(), collapseLoops(), createAtomicCapture(), createAtomicCompare(), createAtomicRead(), createAtomicUpdate(), createAtomicWrite(), createBarrier(), createCachedThreadPrivate(), createCancel(), createCanonicalLoop(), createCopyinClauseBlocks(), createCopyPrivate(), createCritical(), createLoopSkeleton(), createMapperAllocas(), createMasked(), createMaster(), createOffloadEntriesAndInfoMetadata(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSection(), createSections(), createSingle(), createTarget(), createTargetData(), createTargetDeinit(), createTargetInit(), createTargetLoopWorkshareCall(), createTask(), createTaskgroup(), createTeams(), emitBlock(), emitBranch(), emitCancelationCheckImpl(), emitFlush(), emitIfClause(), emitKernelLaunch(), emitMapperCall(), emitNonContiguousDescriptor(), emitOffloadingArrays(), emitOffloadingArraysAndArgs(), emitOffloadingArraysArgument(), emitTargetKernel(), emitTargetTask(), emitTaskDependencies(), emitTaskwaitImpl(), emitTaskyieldImpl(), finalize(), getInsertionPoint(), getKernelArgsVector(), getOrCreateSrcLocStr(), getOrCreateThreadID(), getSizeInBytes(), hostParallelCallback(), targetParallelCallback(), tileLoops(), unrollLoopFull(), unrollLoopHeuristic(), updateToLocation(), and workshareLoopTargetCallback().
OpenMPIRBuilderConfig llvm::OpenMPIRBuilder::Config |
The OpenMPIRBuilder Configuration.
Definition at line 2038 of file OMPIRBuilder.h.
Referenced by applyWorkshareLoop(), createOffloadEntriesAndInfoMetadata(), createOffloadEntry(), createOutlinedFunction(), createParallel(), createPlatformSpecificName(), createReductionsGPU(), createTarget(), createTargetData(), createTeams(), emitTargetRegionFunction(), finalize(), getAddrOfDeclareTargetVar(), llvm::OffloadEntriesInfoManager::registerDeviceGlobalVarEntryInfo(), registerTargetGlobalVariable(), llvm::OffloadEntriesInfoManager::registerTargetRegionEntryInfo(), and setConfig().
SmallVector<llvm::Function *, 16> llvm::OpenMPIRBuilder::ConstantAllocaRaiseCandidates |
A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed.
Definition at line 2081 of file OMPIRBuilder.h.
Referenced by createOutlinedFunction(), and finalize().
SmallVector<FinalizationInfo, 8> llvm::OpenMPIRBuilder::FinalizationStack |
The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive.
Definition at line 2012 of file OMPIRBuilder.h.
Referenced by createParallel(), createSections(), emitCancelationCheckImpl(), isLastFinalizationInfoCancellable(), popFinalizationCB(), and pushFinalizationCB().
Map to remember existing ident_t*.
Definition at line 2050 of file OMPIRBuilder.h.
Referenced by getOrCreateIdent().
StringMap<GlobalVariable *, BumpPtrAllocator> llvm::OpenMPIRBuilder::InternalVars |
An ordered map of auto-generated variables to their unique names.
It stores variables with the following names: 1) ".gomp_critical_user_" + <critical_section_name> + ".var" for "omp critical" directives; 2) <mangled_name_for_global_var> + ".cache." for cache for threadprivate variables.
Definition at line 2095 of file OMPIRBuilder.h.
Referenced by getOrCreateInternalVariable().
std::forward_list<CanonicalLoopInfo> llvm::OpenMPIRBuilder::LoopInfos |
Collection of owned canonical loop objects that eventually need to be free'd.
Definition at line 2085 of file OMPIRBuilder.h.
Referenced by createLoopSkeleton().
Module& llvm::OpenMPIRBuilder::M |
The underlying LLVM-IR module.
Definition at line 2041 of file OMPIRBuilder.h.
Referenced by createAtomicCompare(), createAtomicRead(), createAtomicWrite(), createCopyinClauseBlocks(), createDispatchDeinitFunction(), createDispatchFiniFunction(), createDispatchInitFunction(), createDispatchNextFunction(), createForStaticInitFunction(), createGlobalFlag(), createLoopSkeleton(), createOffloadEntriesAndInfoMetadata(), createOffloadEntry(), createOffloadMapnames(), createOffloadMaptypes(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createParallel(), createReductionsGPU(), createSections(), createTargetDeinit(), createTargetInit(), createTargetLoopWorkshareCall(), createTask(), emitIfClause(), emitNonContiguousDescriptor(), emitOffloadingArrays(), emitOffloadingArraysArgument(), emitTargetKernel(), emitTargetTask(), emitTargetTaskProxyFunction(), emitTaskDependencies(), finalize(), getAddrOfDeclareTargetVar(), getKmpcForStaticLoopForType(), getOrCreateIdent(), getOrCreateInternalVariable(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), initialize(), loadOffloadInfoMetadata(), and registerTargetGlobalVariable().
OffloadEntriesInfoManager llvm::OpenMPIRBuilder::OffloadInfoManager |
Info manager to keep track of target regions.
Definition at line 2053 of file OMPIRBuilder.h.
Referenced by createOffloadEntriesAndInfoMetadata(), emitTargetRegionFunction(), finalize(), loadOffloadInfoMetadata(), registerTargetGlobalVariable(), and registerTargetRegionFunction().
const std::string llvm::OpenMPIRBuilder::ompOffloadInfoName = "omp_offload.info" |
OMP Offload Info Metadata name string.
Definition at line 3227 of file OMPIRBuilder.h.
Referenced by loadOffloadInfoMetadata().
SmallVector<OutlineInfo, 16> llvm::OpenMPIRBuilder::OutlineInfos |
Collection of regions that need to be outlined during finalization.
Definition at line 2076 of file OMPIRBuilder.h.
Referenced by addOutlineInfo(), finalize(), and ~OpenMPIRBuilder().
Map to remember source location strings.
Definition at line 2047 of file OMPIRBuilder.h.
Referenced by getOrCreateSrcLocStr().
The target triple of the underlying module.
Definition at line 2056 of file OMPIRBuilder.h.