LLVM 20.0.0git
Classes | Public Types | Public Member Functions | Static Public Member Functions | Public Attributes | List of all members
llvm::OpenMPIRBuilder Class Reference

An interface to create LLVM-IR for OpenMP directives. More...

#include "llvm/Frontend/OpenMP/OMPIRBuilder.h"

Classes

class  AtomicInfo
 
struct  AtomicOpValue
 a struct to pack relevant information while generating atomic Ops More...
 
struct  CopyOptionsTy
 
struct  DependData
 A struct to pack the relevant information for an OpenMP depend clause. More...
 
struct  FinalizationInfo
 
struct  LocationDescription
 Description of a LLVM-IR insertion point (IP) and a debug/source location (filename, line, column, ...). More...
 
struct  MapInfosTy
 This structure contains combined information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers, and non-contiguous information. More...
 
struct  MapperAllocas
 
struct  OutlineInfo
 Helper that contains information about regions we need to outline during finalization. More...
 
struct  ReductionInfo
 Information about an OpenMP reduction. More...
 
class  TargetDataInfo
 Struct that keeps the information that should be kept throughout a 'target data' region. More...
 
struct  TargetDataRTArgs
 Container for the arguments used to pass data to the runtime library. More...
 
struct  TargetKernelArgs
 Data structure that contains the needed information to construct the kernel args vector. More...
 

Public Types

enum class  ReductionGenCBKind { Clang , MLIR }
 Enum class for the RedctionGen CallBack type to be used. More...
 
enum class  EvalKind { Scalar , Complex , Aggregate }
 Enum class for reduction evaluation types scalar, complex and aggregate. More...
 
enum class  CopyAction : unsigned { RemoteLaneToThread , ThreadCopy }
 
enum class  DeviceInfoTy { None , Pointer , Address }
 
enum  EmitMetadataErrorKind { EMIT_MD_TARGET_REGION_ERROR , EMIT_MD_DECLARE_TARGET_ERROR , EMIT_MD_GLOBAL_VAR_LINK_ERROR }
 The kind of errors that can occur when emitting the offload entries and metadata. More...
 
enum  BodyGenTy { Priv , DupNoPriv , NoPriv }
 Type of BodyGen to use for region codegen. More...
 
using InsertPointTy = IRBuilder<>::InsertPoint
 Type used throughout for insertion points.
 
using InsertPointOrErrorTy = Expected< InsertPointTy >
 Type used to represent an insertion point or an error value.
 
using FinalizeCallbackTy = std::function< Error(InsertPointTy CodeGenIP)>
 Callback type for variable finalization (think destructors).
 
using BodyGenCallbackTy = function_ref< Error(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)>
 Callback type for body (=inner region) code generation.
 
using StorableBodyGenCallbackTy = std::function< Error(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)>
 
using LoopBodyGenCallbackTy = function_ref< Error(InsertPointTy CodeGenIP, Value *IndVar)>
 Callback type for loop body code generation.
 
using PrivatizeCallbackTy = function_ref< InsertPointOrErrorTy(InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value &Original, Value &Inner, Value *&ReplVal)>
 Callback type for variable privatization (think copy & default constructor).
 
using FileIdentifierInfoCallbackTy = std::function< std::tuple< std::string, uint64_t >()>
 
using ReductionGenClangCBTy = std::function< InsertPointTy(InsertPointTy CodeGenIP, unsigned Index, Value **LHS, Value **RHS, Function *CurFn)>
 ReductionGen CallBack for Clang.
 
using ReductionGenCBTy = std::function< InsertPointOrErrorTy(InsertPointTy CodeGenIP, Value *LHS, Value *RHS, Value *&Res)>
 ReductionGen CallBack for MLIR.
 
using ReductionGenAtomicCBTy = std::function< InsertPointOrErrorTy(InsertPointTy, Type *, Value *, Value *)>
 Functions used to generate atomic reductions.
 
using MapValuesArrayTy = SmallVector< Value *, 4 >
 
using MapDeviceInfoArrayTy = SmallVector< DeviceInfoTy, 4 >
 
using MapFlagsArrayTy = SmallVector< omp::OpenMPOffloadMappingFlags, 4 >
 
using MapNamesArrayTy = SmallVector< Constant *, 4 >
 
using MapDimArrayTy = SmallVector< uint64_t, 4 >
 
using MapNonContiguousArrayTy = SmallVector< MapValuesArrayTy, 4 >
 
using EmitFallbackCallbackTy = function_ref< InsertPointOrErrorTy(InsertPointTy)>
 Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails.
 
using TargetTaskBodyCallbackTy = function_ref< Error(Value *DeviceID, Value *RTLoc, IRBuilderBase::InsertPoint TargetTaskAllocaIP)>
 Callback type for generating the bodies of device directives that require outer target tasks (e.g.
 
using EmitMetadataErrorReportFunctionTy = std::function< void(EmitMetadataErrorKind, TargetRegionEntryInfo)>
 Callback function type.
 
using FunctionGenCallback = std::function< Expected< Function * >(StringRef FunctionName)>
 Functions used to generate a function with the given name.
 
using GenMapInfoCallbackTy = function_ref< MapInfosTy &(InsertPointTy CodeGenIP)>
 Callback type for creating the map infos for the kernel parameters.
 
using TargetBodyGenCallbackTy = function_ref< InsertPointOrErrorTy(InsertPointTy AllocaIP, InsertPointTy CodeGenIP)>
 
using TargetGenArgAccessorsCallbackTy = function_ref< InsertPointOrErrorTy(Argument &Arg, Value *Input, Value *&RetVal, InsertPointTy AllocaIP, InsertPointTy CodeGenIP)>
 

Public Member Functions

 OpenMPIRBuilder (Module &M)
 Create a new OpenMPIRBuilder operating on the given module M.
 
 ~OpenMPIRBuilder ()
 
void initialize ()
 Initialize the internal state, this will put structures types and potentially other helpers into the underlying module.
 
void setConfig (OpenMPIRBuilderConfig C)
 
void finalize (Function *Fn=nullptr)
 Finalize the underlying module, e.g., by outlining regions.
 
void addAttributes (omp::RuntimeFunction FnID, Function &Fn)
 Add attributes known for FnID to Fn.
 
std::string createPlatformSpecificName (ArrayRef< StringRef > Parts) const
 Get the create a name using the platform specific separators.
 
void pushFinalizationCB (const FinalizationInfo &FI)
 Push a finalization callback on the finalization stack.
 
void popFinalizationCB ()
 Pop the last finalization callback from the finalization stack.
 
InsertPointOrErrorTy createBarrier (const LocationDescription &Loc, omp::Directive Kind, bool ForceSimpleCall=false, bool CheckCancelFlag=true)
 Emitter methods for OpenMP directives.
 
InsertPointOrErrorTy createCancel (const LocationDescription &Loc, Value *IfCondition, omp::Directive CanceledDirective)
 Generator for '#omp cancel'.
 
InsertPointOrErrorTy createParallel (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, Value *IfCondition, Value *NumThreads, omp::ProcBindKind ProcBind, bool IsCancellable)
 Generator for '#omp parallel'.
 
Expected< CanonicalLoopInfo * > createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *TripCount, const Twine &Name="loop")
 Generator for the control flow structure of an OpenMP canonical loop.
 
Expected< CanonicalLoopInfo * > createCanonicalLoop (const LocationDescription &Loc, LoopBodyGenCallbackTy BodyGenCB, Value *Start, Value *Stop, Value *Step, bool IsSigned, bool InclusiveStop, InsertPointTy ComputeIP={}, const Twine &Name="loop")
 Generator for the control flow structure of an OpenMP canonical loop.
 
CanonicalLoopInfocollapseLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, InsertPointTy ComputeIP)
 Collapse a loop nest into a single loop.
 
ConstantgetAddrOfDeclareTargetVar (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, Type *LlvmPtrTy, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage)
 Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables.
 
void registerTargetGlobalVariable (OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind CaptureClause, OffloadEntriesInfoManager::OMPTargetDeviceClauseKind DeviceClause, bool IsDeclaration, bool IsExternallyVisible, TargetRegionEntryInfo EntryInfo, StringRef MangledName, std::vector< GlobalVariable * > &GeneratedRefs, bool OpenMPSIMD, std::vector< Triple > TargetTriple, std::function< Constant *()> GlobalInitializer, std::function< GlobalValue::LinkageTypes()> VariableLinkage, Type *LlvmPtrTy, Constant *Addr)
 Registers a target variable for device or host.
 
unsigned getFlagMemberOffset ()
 Get the offset of the OMP_MAP_MEMBER_OF field.
 
omp::OpenMPOffloadMappingFlags getMemberOfFlag (unsigned Position)
 Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given.
 
void setCorrectMemberOfFlag (omp::OpenMPOffloadMappingFlags &Flags, omp::OpenMPOffloadMappingFlags MemberOfFlag)
 Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function.
 
InsertPointOrErrorTy applyWorkshareLoop (DebugLoc DL, CanonicalLoopInfo *CLI, InsertPointTy AllocaIP, bool NeedsBarrier, llvm::omp::ScheduleKind SchedKind=llvm::omp::OMP_SCHEDULE_Default, Value *ChunkSize=nullptr, bool HasSimdModifier=false, bool HasMonotonicModifier=false, bool HasNonmonotonicModifier=false, bool HasOrderedClause=false, omp::WorksharingLoopType LoopType=omp::WorksharingLoopType::ForStaticLoop)
 Modifies the canonical loop to be a workshare loop.
 
std::vector< CanonicalLoopInfo * > tileLoops (DebugLoc DL, ArrayRef< CanonicalLoopInfo * > Loops, ArrayRef< Value * > TileSizes)
 Tile a loop nest.
 
void unrollLoopFull (DebugLoc DL, CanonicalLoopInfo *Loop)
 Fully unroll a loop.
 
void unrollLoopHeuristic (DebugLoc DL, CanonicalLoopInfo *Loop)
 Fully or partially unroll a loop.
 
void unrollLoopPartial (DebugLoc DL, CanonicalLoopInfo *Loop, int32_t Factor, CanonicalLoopInfo **UnrolledCLI)
 Partially unroll a loop.
 
void applySimd (CanonicalLoopInfo *Loop, MapVector< Value *, Value * > AlignedVars, Value *IfCond, omp::OrderKind Order, ConstantInt *Simdlen, ConstantInt *Safelen)
 Add metadata to simd-ize a loop.
 
void createFlush (const LocationDescription &Loc)
 Generator for '#omp flush'.
 
void createTaskwait (const LocationDescription &Loc)
 Generator for '#omp taskwait'.
 
void createTaskyield (const LocationDescription &Loc)
 Generator for '#omp taskyield'.
 
InsertPointOrErrorTy createTask (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB, bool Tied=true, Value *Final=nullptr, Value *IfCondition=nullptr, SmallVector< DependData > Dependencies={}, bool Mergeable=false, Value *EventHandle=nullptr)
 Generator for #omp task
 
InsertPointOrErrorTy createTaskgroup (const LocationDescription &Loc, InsertPointTy AllocaIP, BodyGenCallbackTy BodyGenCB)
 Generator for the taskgroup construct.
 
InsertPointOrErrorTy createReductionsGPU (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, ArrayRef< ReductionInfo > ReductionInfos, bool IsNoWait=false, bool IsTeamsReduction=false, bool HasDistribute=false, ReductionGenCBKind ReductionGenCBKind=ReductionGenCBKind::MLIR, std::optional< omp::GV > GridValue={}, unsigned ReductionBufNum=1024, Value *SrcLocInfo=nullptr)
 Design of OpenMP reductions on the GPU.
 
InsertPointOrErrorTy createReductions (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< ReductionInfo > ReductionInfos, ArrayRef< bool > IsByRef, bool IsNoWait=false)
 Generator for '#omp reduction'.
 
InsertPointTy getInsertionPoint ()
 }
 
bool updateToLocation (const LocationDescription &Loc)
 Update the internal location to Loc.
 
FunctionCallee getOrCreateRuntimeFunction (Module &M, omp::RuntimeFunction FnID)
 Return the function declaration for the runtime function with FnID.
 
FunctiongetOrCreateRuntimeFunctionPtr (omp::RuntimeFunction FnID)
 
ConstantgetOrCreateSrcLocStr (StringRef LocStr, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location LocStr.
 
ConstantgetOrCreateDefaultSrcLocStr (uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the default source location.
 
ConstantgetOrCreateSrcLocStr (StringRef FunctionName, StringRef FileName, unsigned Line, unsigned Column, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location identified by the arguments.
 
ConstantgetOrCreateSrcLocStr (DebugLoc DL, uint32_t &SrcLocStrSize, Function *F=nullptr)
 Return the (LLVM-IR) string describing the DebugLoc DL.
 
ConstantgetOrCreateSrcLocStr (const LocationDescription &Loc, uint32_t &SrcLocStrSize)
 Return the (LLVM-IR) string describing the source location Loc.
 
ConstantgetOrCreateIdent (Constant *SrcLocStr, uint32_t SrcLocStrSize, omp::IdentFlag Flags=omp::IdentFlag(0), unsigned Reserve2Flags=0)
 Return an ident_t* encoding the source location SrcLocStr and Flags.
 
GlobalValuecreateGlobalFlag (unsigned Value, StringRef Name)
 Create a hidden global flag Name in the module with initial value Value.
 
Error emitCancelationCheckImpl (Value *CancelFlag, omp::Directive CanceledDirective, FinalizeCallbackTy ExitCB={})
 Generate control flow and cleanup for cancellation.
 
InsertPointTy emitTargetKernel (const LocationDescription &Loc, InsertPointTy AllocaIP, Value *&Return, Value *Ident, Value *DeviceID, Value *NumTeams, Value *NumThreads, Value *HostPtr, ArrayRef< Value * > KernelArgs)
 Generate a target region entry call.
 
void emitFlush (const LocationDescription &Loc)
 Generate a flush runtime call.
 
bool isLastFinalizationInfoCancellable (omp::Directive DK)
 Return true if the last entry in the finalization stack is of kind DK and cancellable.
 
void emitTaskwaitImpl (const LocationDescription &Loc)
 Generate a taskwait runtime call.
 
void emitTaskyieldImpl (const LocationDescription &Loc)
 Generate a taskyield runtime call.
 
ValuegetOrCreateThreadID (Value *Ident)
 Return the current thread ID.
 
void addOutlineInfo (OutlineInfo &&OI)
 Add a new region that will be outlined later.
 
ValuegetSizeInBytes (Value *BasePtr)
 Computes the size of type in bytes.
 
void emitBranch (BasicBlock *Target)
 
void emitBlock (BasicBlock *BB, Function *CurFn, bool IsFinished=false)
 
Error emitIfClause (Value *Cond, BodyGenCallbackTy ThenGen, BodyGenCallbackTy ElseGen, InsertPointTy AllocaIP={})
 Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }.
 
GlobalVariablecreateOffloadMaptypes (SmallVectorImpl< uint64_t > &Mappings, std::string VarName)
 Create the global variable holding the offload mappings information.
 
GlobalVariablecreateOffloadMapnames (SmallVectorImpl< llvm::Constant * > &Names, std::string VarName)
 Create the global variable holding the offload names information.
 
void createMapperAllocas (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumOperands, struct MapperAllocas &MapperAllocas)
 Create the allocas instruction used in call to mapper functions.
 
void emitMapperCall (const LocationDescription &Loc, Function *MapperFunc, Value *SrcLocInfo, Value *MaptypesArg, Value *MapnamesArg, struct MapperAllocas &MapperAllocas, int64_t DeviceID, unsigned NumOperands)
 Create the call for the target mapper function.
 
InsertPointOrErrorTy emitKernelLaunch (const LocationDescription &Loc, Value *OutlinedFnID, EmitFallbackCallbackTy EmitTargetCallFallbackCB, TargetKernelArgs &Args, Value *DeviceID, Value *RTLoc, InsertPointTy AllocaIP)
 Generate a target region entry call and host fallback call.
 
InsertPointOrErrorTy emitTargetTask (TargetTaskBodyCallbackTy TaskBodyCB, Value *DeviceID, Value *RTLoc, OpenMPIRBuilder::InsertPointTy AllocaIP, const SmallVector< llvm::OpenMPIRBuilder::DependData > &Dependencies, bool HasNoWait)
 Generate a target-task for the target construct.
 
void emitOffloadingArraysArgument (IRBuilderBase &Builder, OpenMPIRBuilder::TargetDataRTArgs &RTArgs, OpenMPIRBuilder::TargetDataInfo &Info, bool ForEndCall=false)
 Emit the arguments to be passed to the runtime library based on the arrays of base pointers, pointers, sizes, map types, and mappers.
 
void emitNonContiguousDescriptor (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info)
 Emit an array of struct descriptors to be assigned to the offload args.
 
void emitOffloadingArrays (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, MapInfosTy &CombinedInfo, TargetDataInfo &Info, bool IsNonContiguous=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr)
 Emit the arrays used to pass the captures and map information to the offloading runtime library.
 
void emitOffloadingArraysAndArgs (InsertPointTy AllocaIP, InsertPointTy CodeGenIP, TargetDataInfo &Info, TargetDataRTArgs &RTArgs, MapInfosTy &CombinedInfo, bool IsNonContiguous=false, bool ForEndCall=false, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr)
 Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}).
 
void createOffloadEntry (Constant *ID, Constant *Addr, uint64_t Size, int32_t Flags, GlobalValue::LinkageTypes, StringRef Name="")
 Creates offloading entry for the provided entry ID ID, address Addr, size Size, and flags Flags.
 
void createOffloadEntriesAndInfoMetadata (EmitMetadataErrorReportFunctionTy &ErrorReportFunction)
 
InsertPointTy createCopyPrivate (const LocationDescription &Loc, llvm::Value *BufSize, llvm::Value *CpyBuf, llvm::Value *CpyFn, llvm::Value *DidIt)
 Generator for __kmpc_copyprivate.
 
InsertPointOrErrorTy createSingle (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsNowait, ArrayRef< llvm::Value * > CPVars={}, ArrayRef< llvm::Function * > CPFuncs={})
 Generator for '#omp single'.
 
InsertPointOrErrorTy createMaster (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB)
 Generator for '#omp master'.
 
InsertPointOrErrorTy createMasked (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, Value *Filter)
 Generator for '#omp masked'.
 
InsertPointOrErrorTy createCritical (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, StringRef CriticalName, Value *HintInst)
 Generator for '#omp critical'.
 
InsertPointTy createOrderedDepend (const LocationDescription &Loc, InsertPointTy AllocaIP, unsigned NumLoops, ArrayRef< llvm::Value * > StoreValues, const Twine &Name, bool IsDependSource)
 Generator for '#omp ordered depend (source | sink)'.
 
InsertPointOrErrorTy createOrderedThreadsSimd (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB, bool IsThreads)
 Generator for '#omp ordered [threads | simd]'.
 
InsertPointOrErrorTy createSections (const LocationDescription &Loc, InsertPointTy AllocaIP, ArrayRef< StorableBodyGenCallbackTy > SectionCBs, PrivatizeCallbackTy PrivCB, FinalizeCallbackTy FiniCB, bool IsCancellable, bool IsNowait)
 Generator for '#omp sections'.
 
InsertPointOrErrorTy createSection (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, FinalizeCallbackTy FiniCB)
 Generator for '#omp section'.
 
InsertPointOrErrorTy createTeams (const LocationDescription &Loc, BodyGenCallbackTy BodyGenCB, Value *NumTeamsLower=nullptr, Value *NumTeamsUpper=nullptr, Value *ThreadLimit=nullptr, Value *IfExpr=nullptr)
 Generator for #omp teams
 
InsertPointTy createCopyinClauseBlocks (InsertPointTy IP, Value *MasterAddr, Value *PrivateAddr, llvm::IntegerType *IntPtrTy, bool BranchtoEnd=true)
 Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies.
 
CallInstcreateOMPAlloc (const LocationDescription &Loc, Value *Size, Value *Allocator, std::string Name="")
 Create a runtime call for kmpc_Alloc.
 
CallInstcreateOMPFree (const LocationDescription &Loc, Value *Addr, Value *Allocator, std::string Name="")
 Create a runtime call for kmpc_free.
 
CallInstcreateCachedThreadPrivate (const LocationDescription &Loc, llvm::Value *Pointer, llvm::ConstantInt *Size, const llvm::Twine &Name=Twine(""))
 Create a runtime call for kmpc_threadprivate_cached.
 
CallInstcreateOMPInteropInit (const LocationDescription &Loc, Value *InteropVar, omp::OMPInteropType InteropType, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_init.
 
CallInstcreateOMPInteropDestroy (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_destroy.
 
CallInstcreateOMPInteropUse (const LocationDescription &Loc, Value *InteropVar, Value *Device, Value *NumDependences, Value *DependenceAddress, bool HaveNowaitClause)
 Create a runtime call for __tgt_interop_use.
 
InsertPointTy createTargetInit (const LocationDescription &Loc, bool IsSPMD, int32_t MinThreadsVal=0, int32_t MaxThreadsVal=0, int32_t MinTeamsVal=0, int32_t MaxTeamsVal=0)
 The omp target interface.
 
void createTargetDeinit (const LocationDescription &Loc, int32_t TeamsReductionDataSize=0, int32_t TeamsReductionBufferLength=1024)
 Create a runtime call for kmpc_target_deinit.
 
Error emitTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, FunctionGenCallback &GenerateFunctionCallback, bool IsOffloadEntry, Function *&OutlinedFn, Constant *&OutlinedFnID)
 Create a unique name for the entry function using the source location information of the current target region.
 
ConstantregisterTargetRegionFunction (TargetRegionEntryInfo &EntryInfo, Function *OutlinedFunction, StringRef EntryFnName, StringRef EntryFnIDName)
 Registers the given function and sets up the attribtues of the function Returns the FunctionID.
 
FunctionemitUserDefinedMapper (function_ref< MapInfosTy &(InsertPointTy CodeGenIP, llvm::Value *PtrPHI, llvm::Value *BeginArg)> PrivAndGenMapInfoCB, llvm::Type *ElemTy, StringRef FuncName, function_ref< bool(unsigned int, Function **)> CustomMapperCB=nullptr)
 Emit the user-defined mapper function.
 
InsertPointOrErrorTy createTargetData (const LocationDescription &Loc, InsertPointTy AllocaIP, InsertPointTy CodeGenIP, Value *DeviceID, Value *IfCond, TargetDataInfo &Info, GenMapInfoCallbackTy GenMapInfoCB, omp::RuntimeFunction *MapperFunc=nullptr, function_ref< InsertPointOrErrorTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)> BodyGenCB=nullptr, function_ref< void(unsigned int, Value *)> DeviceAddrCB=nullptr, function_ref< Value *(unsigned int)> CustomMapperCB=nullptr, Value *SrcLocInfo=nullptr)
 Generator for '#omp target data'.
 
InsertPointOrErrorTy createTarget (const LocationDescription &Loc, bool IsOffloadEntry, OpenMPIRBuilder::InsertPointTy AllocaIP, OpenMPIRBuilder::InsertPointTy CodeGenIP, TargetRegionEntryInfo &EntryInfo, ArrayRef< int32_t > NumTeams, ArrayRef< int32_t > NumThreads, SmallVectorImpl< Value * > &Inputs, GenMapInfoCallbackTy GenMapInfoCB, TargetBodyGenCallbackTy BodyGenCB, TargetGenArgAccessorsCallbackTy ArgAccessorFuncCB, SmallVector< DependData > Dependencies={}, bool HasNowait=false)
 Generator for '#omp target'.
 
FunctionCallee createForStaticInitFunction (unsigned IVSize, bool IVSigned, bool IsGPUDistribute)
 Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned.
 
FunctionCallee createDispatchInitFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned.
 
FunctionCallee createDispatchNextFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned.
 
FunctionCallee createDispatchFiniFunction (unsigned IVSize, bool IVSigned)
 Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned.
 
FunctionCallee createDispatchDeinitFunction ()
 Returns __kmpc_dispatch_deinit runtime function.
 
InsertPointTy createAtomicRead (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOrdering AO)
 Emit atomic Read for : V = X — Only Scalar data types.
 
InsertPointTy createAtomicWrite (const LocationDescription &Loc, AtomicOpValue &X, Value *Expr, AtomicOrdering AO)
 Emit atomic write for : X = Expr — Only Scalar data types.
 
InsertPointOrErrorTy createAtomicUpdate (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool IsXBinopExpr)
 Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types.
 
InsertPointOrErrorTy createAtomicCapture (const LocationDescription &Loc, InsertPointTy AllocaIP, AtomicOpValue &X, AtomicOpValue &V, Value *Expr, AtomicOrdering AO, AtomicRMWInst::BinOp RMWOp, AtomicUpdateCallbackTy &UpdateOp, bool UpdateExpr, bool IsPostfixUpdate, bool IsXBinopExpr)
 Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,.
 
InsertPointTy createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly)
 Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ? expr : x; x = expr ordop x ? expr : x; x = x == e ? d : x; x = e == x ? d : x; (this one is not in the spec) cond-update-stmt: if (x ordop expr) { x = expr; } if (expr ordop x) { x = expr; } if (x == e) { x = d; } if (e == x) { x = d; } (this one is not in the spec) conditional-update-capture-atomic: v = x; cond-update-stmt; (IsPostfixUpdate=true, IsFailOnly=false) cond-update-stmt; v = x; (IsPostfixUpdate=false, IsFailOnly=false) if (x == e) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) r = x == e; if (r) { x = d; } (IsPostfixUpdate=false, IsFailOnly=false) r = x == e; if (r) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true)
 
InsertPointTy createAtomicCompare (const LocationDescription &Loc, AtomicOpValue &X, AtomicOpValue &V, AtomicOpValue &R, Value *E, Value *D, AtomicOrdering AO, omp::OMPAtomicCompareOp Op, bool IsXBinopExpr, bool IsPostfixUpdate, bool IsFailOnly, AtomicOrdering Failure)
 
CanonicalLoopInfocreateLoopSkeleton (DebugLoc DL, Value *TripCount, Function *F, BasicBlock *PreInsertBefore, BasicBlock *PostInsertBefore, const Twine &Name={})
 Create the control flow structure of a canonical OpenMP loop.
 
void loadOffloadInfoMetadata (Module &M)
 Loads all the offload entries information from the host IR metadata.
 
void loadOffloadInfoMetadata (StringRef HostFilePath)
 Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument.
 
GlobalVariablegetOrCreateInternalVariable (Type *Ty, const StringRef &Name, unsigned AddressSpace=0)
 Gets (if variable with the given name already exist) or creates internal global variable with the specified Name.
 

Static Public Member Functions

static unsigned getOpenMPDefaultSimdAlign (const Triple &TargetTriple, const StringMap< bool > &Features)
 Get the default alignment value for given target.
 
static TargetRegionEntryInfo getTargetEntryUniqueInfo (FileIdentifierInfoCallbackTy CallBack, StringRef ParentName="")
 Creates a unique info for a target entry when provided a filename and line number from.
 
static void getKernelArgsVector (TargetKernelArgs &KernelArgs, IRBuilderBase &Builder, SmallVector< Value * > &ArgsVector)
 Create the kernel args vector used by emitTargetKernel.
 
static std::pair< int32_t, int32_t > readThreadBoundsForKernel (const Triple &T, Function &Kernel)
 }
 
static void writeThreadBoundsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB)
 
static std::pair< int32_t, int32_t > readTeamBoundsForKernel (const Triple &T, Function &Kernel)
 Read/write a bounds on teams for Kernel.
 
static void writeTeamsForKernel (const Triple &T, Function &Kernel, int32_t LB, int32_t UB)
 

Public Attributes

SmallVector< FinalizationInfo, 8 > FinalizationStack
 The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive.
 
OpenMPIRBuilderConfig Config
 The OpenMPIRBuilder Configuration.
 
ModuleM
 The underlying LLVM-IR module.
 
IRBuilder Builder
 The LLVM-IR Builder used to create IR.
 
StringMap< Constant * > SrcLocStrMap
 Map to remember source location strings.
 
DenseMap< std::pair< Constant *, uint64_t >, Constant * > IdentMap
 Map to remember existing ident_t*.
 
OffloadEntriesInfoManager OffloadInfoManager
 Info manager to keep track of target regions.
 
const Triple T
 The target triple of the underlying module.
 
SmallVector< OutlineInfo, 16 > OutlineInfos
 Collection of regions that need to be outlined during finalization.
 
SmallVector< llvm::Function *, 16 > ConstantAllocaRaiseCandidates
 A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed.
 
std::forward_list< CanonicalLoopInfoLoopInfos
 Collection of owned canonical loop objects that eventually need to be free'd.
 
StringMap< GlobalVariable *, BumpPtrAllocatorInternalVars
 An ordered map of auto-generated variables to their unique names.
 
const std::string ompOffloadInfoName = "omp_offload.info"
 OMP Offload Info Metadata name string.
 

Detailed Description

An interface to create LLVM-IR for OpenMP directives.

Each OpenMP directive has a corresponding public generator method.

Definition at line 474 of file OMPIRBuilder.h.

Member Typedef Documentation

◆ BodyGenCallbackTy

Callback type for body (=inner region) code generation.

The callback takes code locations as arguments, each describing a location where additional instructions can be inserted.

The CodeGenIP may be in the middle of a basic block or point to the end of it. The basic block may have a terminator or be degenerate. The callback function may just insert instructions at that position, but also split the block (without the Before argument of BasicBlock::splitBasicBlock such that the identify of the split predecessor block is preserved) and insert additional control flow, including branches that do not lead back to what follows the CodeGenIP. Note that since the callback is allowed to split the block, callers must assume that InsertPoints to positions in the BasicBlock after CodeGenIP including CodeGenIP itself are invalidated. If such InsertPoints need to be preserved, it can split the block itself before calling the callback.

AllocaIP and CodeGenIP must not point to the same position.

Parameters
AllocaIPis the insertion point at which new alloca instructions should be placed. The BasicBlock it is pointing to must not be split.
CodeGenIPis the insertion point at which the body code should be placed.
Returns
an error, if any were triggered during execution.

Definition at line 596 of file OMPIRBuilder.h.

◆ EmitFallbackCallbackTy

Callback function type for functions emitting the host fallback code that is executed when the kernel launch fails.

It takes an insertion point as parameter where the code should be emitted. It returns an insertion point that points right after after the emitted code.

Definition at line 2363 of file OMPIRBuilder.h.

◆ EmitMetadataErrorReportFunctionTy

Callback function type.

Definition at line 2466 of file OMPIRBuilder.h.

◆ FileIdentifierInfoCallbackTy

using llvm::OpenMPIRBuilder::FileIdentifierInfoCallbackTy = std::function<std::tuple<std::string, uint64_t>()>

Definition at line 1283 of file OMPIRBuilder.h.

◆ FinalizeCallbackTy

Callback type for variable finalization (think destructors).

Parameters
CodeGenIPis the insertion point at which the finalization code should be placed.

A finalize callback knows about all objects that need finalization, e.g. destruction, when the scope of the currently generated construct is left at the time, and location, the callback is invoked.

Definition at line 543 of file OMPIRBuilder.h.

◆ FunctionGenCallback

using llvm::OpenMPIRBuilder::FunctionGenCallback = std::function<Expected<Function *>(StringRef FunctionName)>

Functions used to generate a function with the given name.

Definition at line 2789 of file OMPIRBuilder.h.

◆ GenMapInfoCallbackTy

Callback type for creating the map infos for the kernel parameters.

Parameters
CodeGenIPis the insertion point where code should be generated, if any.

Definition at line 2857 of file OMPIRBuilder.h.

◆ InsertPointOrErrorTy

Type used to represent an insertion point or an error value.

Definition at line 523 of file OMPIRBuilder.h.

◆ InsertPointTy

Type used throughout for insertion points.

Definition at line 520 of file OMPIRBuilder.h.

◆ LoopBodyGenCallbackTy

Callback type for loop body code generation.

Parameters
CodeGenIPis the insertion point where the loop's body code must be placed. This will be a dedicated BasicBlock with a conditional branch from the loop condition check and terminated with an unconditional branch to the loop latch.
IndVaris the induction variable usable at the insertion point.
Returns
an error, if any were triggered during execution.

Definition at line 618 of file OMPIRBuilder.h.

◆ MapDeviceInfoArrayTy

Definition at line 2313 of file OMPIRBuilder.h.

◆ MapDimArrayTy

Definition at line 2316 of file OMPIRBuilder.h.

◆ MapFlagsArrayTy

Definition at line 2314 of file OMPIRBuilder.h.

◆ MapNamesArrayTy

Definition at line 2315 of file OMPIRBuilder.h.

◆ MapNonContiguousArrayTy

Definition at line 2317 of file OMPIRBuilder.h.

◆ MapValuesArrayTy

Definition at line 2312 of file OMPIRBuilder.h.

◆ PrivatizeCallbackTy

Callback type for variable privatization (think copy & default constructor).

Parameters
AllocaIPis the insertion point at which new alloca instructions should be placed.
CodeGenIPis the insertion point at which the privatization code should be placed.
OriginalThe value being copied/created, should not be used in the generated IR.
InnerThe equivalent of Original that should be used in the generated IR; this is equal to Original if the value is a pointer and can thus be passed directly, otherwise it is an equivalent but different value.
ReplValThe replacement value, thus a copy or new created version of Inner.
Returns
The new insertion point where code generation continues and ReplVal the replacement value.

Definition at line 639 of file OMPIRBuilder.h.

◆ ReductionGenAtomicCBTy

Functions used to generate atomic reductions.

Such functions take two Values representing pointers to LHS and RHS of the reduction, as well as the element type of these pointers. They are expected to atomically update the LHS to the reduced value.

Definition at line 1326 of file OMPIRBuilder.h.

◆ ReductionGenCBTy

ReductionGen CallBack for MLIR.

Parameters
CodeGenIPInsertPoint for CodeGen.
LHSPass in the LHS Value to be used for CodeGen.
RHSPass in the RHS Value to be used for CodeGen.

Definition at line 1319 of file OMPIRBuilder.h.

◆ ReductionGenClangCBTy

ReductionGen CallBack for Clang.

Parameters
CodeGenIPInsertPoint for CodeGen.
IndexIndex of the ReductionInfo to generate code for.
LHSPtrOptionally used by Clang to return the LHSPtr it used for codegen, used for fixup later.
RHSPtrOptionally used by Clang to return the RHSPtr it used for codegen, used for fixup later.
CurFnOptionally used by Clang to pass in the Current Function as Clang context may be old.

Definition at line 1310 of file OMPIRBuilder.h.

◆ StorableBodyGenCallbackTy

Returns
an error, if any were triggered during execution.

Definition at line 605 of file OMPIRBuilder.h.

◆ TargetBodyGenCallbackTy

Definition at line 2950 of file OMPIRBuilder.h.

◆ TargetGenArgAccessorsCallbackTy

Definition at line 2953 of file OMPIRBuilder.h.

◆ TargetTaskBodyCallbackTy

Callback type for generating the bodies of device directives that require outer target tasks (e.g.

in case of having nowait or depend clauses).

Parameters
DeviceIDThe ID of the device on which the target region will execute.
RTLocSource location identifier \Param TargetTaskAllocaIP Insertion point for the alloca block of the generated task.
Returns
an error, if any were triggered during execution.

Definition at line 2392 of file OMPIRBuilder.h.

Member Enumeration Documentation

◆ BodyGenTy

Type of BodyGen to use for region codegen.

Priv: If device pointer privatization is required, emit the body of the region here. It will have to be duplicated: with and without privatization. DupNoPriv: If we need device pointer privatization, we need to emit the body of the region with no privatization in the 'else' branch of the conditional. NoPriv: If we don't require privatization of device pointers, we emit the body in between the runtime calls. This avoids duplicating the body code.

Enumerator
Priv 
DupNoPriv 
NoPriv 

Definition at line 2852 of file OMPIRBuilder.h.

◆ CopyAction

Enumerator
RemoteLaneToThread 
ThreadCopy 

Definition at line 1376 of file OMPIRBuilder.h.

◆ DeviceInfoTy

Enumerator
None 
Pointer 
Address 

Definition at line 2311 of file OMPIRBuilder.h.

◆ EmitMetadataErrorKind

The kind of errors that can occur when emitting the offload entries and metadata.

Enumerator
EMIT_MD_TARGET_REGION_ERROR 
EMIT_MD_DECLARE_TARGET_ERROR 
EMIT_MD_GLOBAL_VAR_LINK_ERROR 

Definition at line 2459 of file OMPIRBuilder.h.

◆ EvalKind

Enum class for reduction evaluation types scalar, complex and aggregate.

Enumerator
Scalar 
Complex 
Aggregate 

Definition at line 1330 of file OMPIRBuilder.h.

◆ ReductionGenCBKind

Enum class for the RedctionGen CallBack type to be used.

Enumerator
Clang 
MLIR 

Definition at line 1298 of file OMPIRBuilder.h.

Constructor & Destructor Documentation

◆ OpenMPIRBuilder()

llvm::OpenMPIRBuilder::OpenMPIRBuilder ( Module M)
inline

Create a new OpenMPIRBuilder operating on the given module M.

This will not have an effect on M (see initialize)

Definition at line 478 of file OMPIRBuilder.h.

◆ ~OpenMPIRBuilder()

OpenMPIRBuilder::~OpenMPIRBuilder ( )

Definition at line 818 of file OMPIRBuilder.cpp.

References assert(), and OutlineInfos.

Member Function Documentation

◆ addAttributes()

void OpenMPIRBuilder::addAttributes ( omp::RuntimeFunction  FnID,
Function Fn 
)

◆ addOutlineInfo()

void llvm::OpenMPIRBuilder::addOutlineInfo ( OutlineInfo &&  OI)
inline

Add a new region that will be outlined later.

Definition at line 2128 of file OMPIRBuilder.h.

References OutlineInfos.

Referenced by createParallel(), createTask(), createTeams(), and emitTargetTask().

◆ applySimd()

void OpenMPIRBuilder::applySimd ( CanonicalLoopInfo Loop,
MapVector< Value *, Value * >  AlignedVars,
Value IfCond,
omp::OrderKind  Order,
ConstantInt Simdlen,
ConstantInt Safelen 
)

Add metadata to simd-ize a loop.

If IfCond is not nullptr, the loop is cloned. The metadata which prevents vectorization is added to to the cloned loop. The cloned loop is executed when ifCond is evaluated to false.

Parameters
LoopThe loop to simd-ize.
AlignedVarsThe map which containts pairs of the pointer and its corresponding alignment.
IfCondThe value which corresponds to the if clause condition.
OrderThe enum to map order clause.
SimdlenThe Simdlen length to apply to the simd loop.
SafelenThe Safelen length to apply to the simd loop.

Definition at line 5282 of file OMPIRBuilder.cpp.

References addBasicBlockMetadata(), addLoopMetadata(), addSimdMetadata(), assert(), llvm::Block, Builder, llvm::IRBuilderBase::CreateAlignmentAssumption(), F, FAM, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::CanonicalLoopInfo::getCond(), llvm::IRBuilderBase::getContext(), llvm::MDNode::getDistinct(), llvm::ConstantInt::getFalse(), llvm::CanonicalLoopInfo::getFunction(), llvm::CanonicalLoopInfo::getHeader(), llvm::Type::getInt1Ty(), llvm::CanonicalLoopInfo::getLatch(), llvm::LoopInfoBase< BlockT, LoopT >::getLoopFor(), llvm::CanonicalLoopInfo::getPreheader(), llvm::BasicBlock::getTerminator(), llvm::ConstantInt::getTrue(), llvm::SmallSet< T, N, C >::insert(), llvm::ValueMap< KeyT, ValueT, Config >::lookup(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::AnalysisManager< IRUnitT, ExtraArgTs >::registerPass(), llvm::IRBuilderBase::restoreIP(), llvm::LoopAnalysis::run(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::MapVector< KeyT, ValueT, MapType, VectorType >::size().

◆ applyWorkshareLoop()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::applyWorkshareLoop ( DebugLoc  DL,
CanonicalLoopInfo CLI,
InsertPointTy  AllocaIP,
bool  NeedsBarrier,
llvm::omp::ScheduleKind  SchedKind = llvm::omp::OMP_SCHEDULE_Default,
Value ChunkSize = nullptr,
bool  HasSimdModifier = false,
bool  HasMonotonicModifier = false,
bool  HasNonmonotonicModifier = false,
bool  HasOrderedClause = false,
omp::WorksharingLoopType  LoopType = omp::WorksharingLoopType::ForStaticLoop 
)

Modifies the canonical loop to be a workshare loop.

This takes a LoopInfo representing a canonical loop, such as the one created by createCanonicalLoop and emits additional instructions to turn it into a workshare loop. In particular, it calls to an OpenMP runtime function in the preheader to obtain the loop bounds to be used in the current thread, updates the relevant instructions in the canonical loop and calls to an OpenMP runtime finalization function after the loop.

The concrete transformation is done by applyStaticWorkshareLoop, applyStaticChunkedWorkshareLoop, or applyDynamicWorkshareLoop, depending on the value of SchedKind and ChunkSize.

Parameters
DLDebug location for instructions added for the workshare-loop construct itself.
CLIA descriptor of the canonical loop to workshare.
AllocaIPAn insertion point for Alloca instructions usable in the preheader of the loop.
NeedsBarrierIndicates whether a barrier must be insterted after the loop.
SchedKindScheduling algorithm to use.
ChunkSizeThe chunk size for the inner loop.
HasSimdModifierWhether the simd modifier is present in the schedule clause.
HasMonotonicModifierWhether the monotonic modifier is present in the schedule clause.
HasNonmonotonicModifierWhether the nonmonotonic modifier is present in the schedule clause.
HasOrderedClauseWhether the (parameterless) ordered clause is present.
LoopTypeInformation about type of loop worksharing. It corresponds to type of loop workshare OpenMP pragma.
Returns
Point where to insert code after the workshare construct.

Definition at line 4556 of file OMPIRBuilder.cpp.

References assert(), computeOpenMPScheduleType(), Config, DL, llvm::OpenMPIRBuilderConfig::isTargetDevice(), and llvm_unreachable.

◆ collapseLoops()

CanonicalLoopInfo * OpenMPIRBuilder::collapseLoops ( DebugLoc  DL,
ArrayRef< CanonicalLoopInfo * >  Loops,
InsertPointTy  ComputeIP 
)

Collapse a loop nest into a single loop.

Merges loops of a loop nest into a single CanonicalLoopNest representation that has the same number of innermost loop iterations as the origin loop nest. The induction variables of the input loops are derived from the collapsed loop's induction variable. This is intended to be used to implement OpenMP's collapse clause. Before applying a directive, collapseLoops normalizes a loop nest to contain only a single loop and the directive's implementation does not need to handle multiple loops itself. This does not remove the need to handle all loop nest handling by directives, such as the ordered(<n>) clause or the simd schedule-clause modifier of the worksharing-loop directive.

Example:

for (int i = 0; i < 7; ++i) // Canonical loop "i"
for (int j = 0; j < 9; ++j) // Canonical loop "j"
body(i, j);

After collapsing with Loops={i,j}, the loop is changed to

for (int ij = 0; ij < 63; ++ij) {
int i = ij / 9;
int j = ij % 9;
body(i, j);
}

In the current implementation, the following limitations apply:

  • All input loops have an induction variable of the same type.
  • The collapsed loop will have the same trip count integer type as the input loops. Therefore it is possible that the collapsed loop cannot represent all iterations of the input loops. For instance, assuming a 32 bit integer type, and two input loops both iterating 2^16 times, the theoretical trip count of the collapsed loop would be 2^32 iteration, which cannot be represented in an 32-bit integer. Behavior is undefined in this case.
  • The trip counts of every input loop must be available at ComputeIP. Non-rectangular loops are not yet supported.
  • At each nest level, code between a surrounding loop and its nested loop is hoisted into the loop body, and such code will be executed more often than before collapsing (or not at all if any inner loop iteration has a trip count of 0). This is permitted by the OpenMP specification.
Parameters
DLDebug location for instructions added for collapsing, such as instructions to compute/derive the input loop's induction variables.
LoopsLoops in the loop nest to collapse. Loops are specified from outermost-to-innermost and every control flow of a loop's body must pass through its directly nested loop.
ComputeIPWhere additional instruction that compute the collapsed trip count. If not set, defaults to before the generated loop.
Returns
The CanonicalLoopInfo object representing the collapsed loop.

Definition at line 4822 of file OMPIRBuilder.cpp.

References assert(), Builder, createLoopSkeleton(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateUDiv(), llvm::IRBuilderBase::CreateURem(), DL, F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getLatch(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::IRBuilderBase::InsertPoint::isSet(), Loops, redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::SmallVectorImpl< T >::reserve(), llvm::SmallVectorImpl< T >::resize(), llvm::IRBuilderBase::restoreIP(), and llvm::IRBuilderBase::SetCurrentDebugLocation().

◆ createAtomicCapture()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicCapture ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
AtomicOpValue X,
AtomicOpValue V,
Value Expr,
AtomicOrdering  AO,
AtomicRMWInst::BinOp  RMWOp,
AtomicUpdateCallbackTy UpdateOp,
bool  UpdateExpr,
bool  IsPostfixUpdate,
bool  IsXBinopExpr 
)

Emit atomic update for constructs: — Only Scalar data types V = X; X = X BinOp Expr , X = X BinOp Expr; V = X, V = X; X = Expr BinOp X, X = Expr BinOp X; V = X, V = X; X = UpdateOp(X), X = UpdateOp(X); V = X,.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
XThe target atomic pointer to be updated
VMemory address where to store captured value
ExprThe value to update X with.
AOAtomic ordering of the generated atomic instructions
RMWOpThe binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated.
UpdateOpCode generator for complex expressions that cannot be expressed through atomicrmw instruction.
UpdateExprtrue if X is an in place update of the form X = X BinOp Expr or X = Expr BinOp X
IsXBinopExprtrue if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr)
IsPostfixUpdatetrue if original value of 'x' must be stored in 'v', not an updated one.
Returns
Insertion point after generated atomic capture IR.

Definition at line 8642 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::IRBuilderBase::CreateStore(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::IRBuilderBase::saveIP(), llvm::Expected< T >::takeError(), updateToLocation(), X, and llvm::AtomicRMWInst::Xchg.

◆ createAtomicCompare() [1/2]

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare ( const LocationDescription Loc,
AtomicOpValue X,
AtomicOpValue V,
AtomicOpValue R,
Value E,
Value D,
AtomicOrdering  AO,
omp::OMPAtomicCompareOp  Op,
bool  IsXBinopExpr,
bool  IsPostfixUpdate,
bool  IsFailOnly 
)

Emit atomic compare for constructs: — Only scalar data types cond-expr-stmt: x = x ordop expr ? expr : x; x = expr ordop x ? expr : x; x = x == e ? d : x; x = e == x ? d : x; (this one is not in the spec) cond-update-stmt: if (x ordop expr) { x = expr; } if (expr ordop x) { x = expr; } if (x == e) { x = d; } if (e == x) { x = d; } (this one is not in the spec) conditional-update-capture-atomic: v = x; cond-update-stmt; (IsPostfixUpdate=true, IsFailOnly=false) cond-update-stmt; v = x; (IsPostfixUpdate=false, IsFailOnly=false) if (x == e) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true) r = x == e; if (r) { x = d; } (IsPostfixUpdate=false, IsFailOnly=false) r = x == e; if (r) { x = d; } else { v = x; } (IsPostfixUpdate=false, IsFailOnly=true)

Parameters
LocThe insert and source location description.
XThe target atomic pointer to be updated.
VMemory address where to store captured value (for compare capture only).
RMemory address where to store comparison result (for compare capture with '==' only).
EThe expected value ('e') for forms that use an equality comparison or an expression ('expr') for forms that use 'ordop' (logically an atomic maximum or minimum).
DThe desired value for forms that use an equality comparison. If forms that use 'ordop', it should be nullptr.
AOAtomic ordering of the generated atomic instructions.
OpAtomic compare operation. It can only be ==, <, or >.
IsXBinopExprTrue if the conditional statement is in the form where x is on LHS. It only matters for < or >.
IsPostfixUpdateTrue if original value of 'x' must be stored in 'v', not an updated one (for compare capture only).
IsFailOnlyTrue if the original value of 'x' is stored to 'v' only when the comparison fails. This is only valid for the case the comparison is '=='.
Returns
Insertion point after generated atomic capture IR.

Definition at line 8678 of file OMPIRBuilder.cpp.

References createAtomicCompare(), D, llvm::AtomicCmpXchgInst::getStrongestFailureOrdering(), and X.

Referenced by createAtomicCompare().

◆ createAtomicCompare() [2/2]

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicCompare ( const LocationDescription Loc,
AtomicOpValue X,
AtomicOpValue V,
AtomicOpValue R,
Value E,
Value D,
AtomicOrdering  AO,
omp::OMPAtomicCompareOp  Op,
bool  IsXBinopExpr,
bool  IsPostfixUpdate,
bool  IsFailOnly,
AtomicOrdering  Failure 
)

◆ createAtomicRead()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicRead ( const LocationDescription Loc,
AtomicOpValue X,
AtomicOpValue V,
AtomicOrdering  AO 
)

◆ createAtomicUpdate()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createAtomicUpdate ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
AtomicOpValue X,
Value Expr,
AtomicOrdering  AO,
AtomicRMWInst::BinOp  RMWOp,
AtomicUpdateCallbackTy UpdateOp,
bool  IsXBinopExpr 
)

Emit atomic update for constructs: X = X BinOp Expr ,or X = Expr BinOp X For complex Operations: X = UpdateOp(X) => CmpExch X, old_X, UpdateOp(X) Only Scalar data types.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
XThe target atomic pointer to be updated
ExprThe value to update X with.
AOAtomic ordering of the generated atomic instructions.
RMWOpThe binary operation used for update. If operation is not supported by atomicRMW, or belong to {FADD, FSUB, BAD_BINOP}. Then a cmpExch based atomic will be generated.
UpdateOpCode generator for complex expressions that cannot be expressed through atomicrmw instruction.
IsXBinopExprtrue if X is Left H.S. in Right H.S. part of the update expression, false otherwise. (e.g. true for X = X BinOp Expr)
Returns
Insertion point after generated atomic update IR.

Definition at line 8423 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), LLVM_DEBUG, llvm::AtomicRMWInst::Max, llvm::AtomicRMWInst::Min, llvm::IRBuilderBase::saveIP(), llvm::AtomicRMWInst::UMax, llvm::AtomicRMWInst::UMin, updateToLocation(), and X.

◆ createAtomicWrite()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createAtomicWrite ( const LocationDescription Loc,
AtomicOpValue X,
Value Expr,
AtomicOrdering  AO 
)

Emit atomic write for : X = Expr — Only Scalar data types.

Parameters
LocThe insert and source location description.
XThe target pointer to be atomically written to
ExprThe value to store.
AOAtomic ordering of the generated atomic instructions.
Returns
Insertion point after generated atomic Write IR.

Definition at line 8393 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateStore(), llvm::IntegerType::get(), llvm::Module::getContext(), llvm::Type::getScalarSizeInBits(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::Type::isFloatingPointTy(), llvm::Type::isIntegerTy(), llvm::Type::isPointerTy(), M, llvm::IRBuilderBase::saveIP(), llvm::StoreInst::setAtomic(), updateToLocation(), and X.

◆ createBarrier()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createBarrier ( const LocationDescription Loc,
omp::Directive  Kind,
bool  ForceSimpleCall = false,
bool  CheckCancelFlag = true 
)

Emitter methods for OpenMP directives.

{ Generator for '#omp barrier'

Parameters
LocThe location where the barrier directive was encountered.
KindThe kind of directive that caused the barrier.
ForceSimpleCallFlag to force a simple (=non-cancellation) barrier.
CheckCancelFlagFlag to indicate a cancel barrier return value should be checked and acted upon.
ThreadIDOptional parameter to pass in any existing ThreadID value.
Returns
The insertion point after the barrier.

Definition at line 948 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), emitCancelationCheckImpl(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, isLastFinalizationInfoCancellable(), llvm::IRBuilderBase::saveIP(), and updateToLocation().

Referenced by createCancel(), and createSingle().

◆ createCachedThreadPrivate()

CallInst * OpenMPIRBuilder::createCachedThreadPrivate ( const LocationDescription Loc,
llvm::Value Pointer,
llvm::ConstantInt Size,
const llvm::Twine Name = Twine("") 
)

Create a runtime call for kmpc_threadprivate_cached.

Parameters
LocThe insert and source location description.
Pointerpointer to data to be cached
Sizesize of data to be cached
NameName of call Instruction for callinst
Returns
CallInst to the thread private cache call.

Definition at line 6110 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateInternalVariable(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, Pointer, Size, and updateToLocation().

◆ createCancel()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createCancel ( const LocationDescription Loc,
Value IfCondition,
omp::Directive  CanceledDirective 
)

◆ createCanonicalLoop() [1/2]

Expected< CanonicalLoopInfo * > OpenMPIRBuilder::createCanonicalLoop ( const LocationDescription Loc,
LoopBodyGenCallbackTy  BodyGenCB,
Value Start,
Value Stop,
Value Step,
bool  IsSigned,
bool  InclusiveStop,
InsertPointTy  ComputeIP = {},
const Twine Name = "loop" 
)

Generator for the control flow structure of an OpenMP canonical loop.

Instead of a logical iteration space, this allows specifying user-defined loop counter values using increment, upper- and lower bounds. To disambiguate the terminology when counting downwards, instead of lower bounds we use Start for the loop counter value in the first body iteration.

Consider the following limitations:

  • A loop counter space over all integer values of its bit-width cannot be represented. E.g using uint8_t, its loop trip count of 256 cannot be stored into an 8 bit integer):

    DO I = 0, 255, 1

  • Unsigned wrapping is only supported when wrapping only "once"; E.g. effectively counting downwards:

    for (uint8_t i = 100u; i > 0; i += 127u)

TODO: May need to add additional parameters to represent:

  • Allow representing downcounting with unsigned integers.
  • Sign of the step and the comparison operator might disagree:

    for (int i = 0; i < 42; i -= 1u)

    Parameters
    LocThe insert and source location description.
    BodyGenCBCallback that will generate the loop body code.
    StartValue of the loop counter for the first iterations.
    StopLoop counter values past this will stop the loop.
    StepLoop counter increment after each iteration; negative means counting down.
    IsSignedWhether Start, Stop and Step are signed integers.
    InclusiveStopWhether Stop itself is a valid value for the loop counter.
    ComputeIPInsertion point for instructions computing the trip count. Can be used to ensure the trip count is available at the outermost loop of a loop nest. If not set, defaults to the preheader of the generated loop.
    NameBase name used to derive BB and instruction names.
    Returns
    An object representing the created control flow structure which can be used for loop-associated directives.
  • A Step of INT_MIN cannot not be normalized to a positive direction:

Definition at line 3986 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::IRBuilderBase::CreateAdd(), createCanonicalLoop(), llvm::IRBuilderBase::CreateICmp(), llvm::IRBuilderBase::CreateICmpSLT(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateNeg(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateSub(), llvm::IRBuilderBase::CreateUDiv(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::Value::getType(), llvm::CmpInst::ICMP_SLE, llvm::CmpInst::ICMP_SLT, llvm::CmpInst::ICMP_ULE, llvm::CmpInst::ICMP_ULT, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::InsertPoint::isSet(), IV, Name, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), and updateToLocation().

◆ createCanonicalLoop() [2/2]

Expected< CanonicalLoopInfo * > OpenMPIRBuilder::createCanonicalLoop ( const LocationDescription Loc,
LoopBodyGenCallbackTy  BodyGenCB,
Value TripCount,
const Twine Name = "loop" 
)

Generator for the control flow structure of an OpenMP canonical loop.

This generator operates on the logical iteration space of the loop, i.e. the caller only has to provide a loop trip count of the loop as defined by base language semantics. The trip count is interpreted as an unsigned integer. The induction variable passed to BodyGenCB will be of the same type and run from 0 to TripCount - 1. It is up to the callback to convert the logical iteration variable to the loop counter variable in the loop body.

Parameters
LocThe insert and source location description. The insert location can be between two instructions or the end of a degenerate block (e.g. a BB under construction).
BodyGenCBCallback that will generate the loop body code.
TripCountNumber of iterations the loop body is executed.
NameBase name used to derive BB and instruction names.
Returns
An object representing the created control flow structure which can be used for loop-associated directives.

Definition at line 3956 of file OMPIRBuilder.cpp.

References After, llvm::CanonicalLoopInfo::assertOK(), Builder, llvm::IRBuilderBase::CreateBr(), createLoopSkeleton(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::CanonicalLoopInfo::getAfter(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::CanonicalLoopInfo::getBodyIP(), llvm::CanonicalLoopInfo::getIndVar(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::OpenMPIRBuilder::LocationDescription::IP, Name, llvm::spliceBB(), and updateToLocation().

Referenced by createCanonicalLoop(), and createSections().

◆ createCopyinClauseBlocks()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyinClauseBlocks ( InsertPointTy  IP,
Value MasterAddr,
Value PrivateAddr,
llvm::IntegerType IntPtrTy,
bool  BranchtoEnd = true 
)

Generate conditional branch and relevant BasicBlocks through which private threads copy the 'copyin' variables from Master copy to threadprivate copies.

Parameters
IPinsertion block for copyin conditional
MasterVarPtra pointer to the master variable
PrivateVarPtra pointer to the threadprivate variable
IntPtrTyPointer size type
BranchtoEndCreate a branch between the copyin.not.master blocks
Returns
The insertion point where copying operation to be emitted.

Definition at line 5944 of file OMPIRBuilder.cpp.

References Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpNE(), llvm::IRBuilderBase::CreatePtrToInt(), llvm::Instruction::eraseFromParent(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), llvm::IRBuilderBase::InsertPoint::isSet(), M, llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), and llvm::BasicBlock::splitBasicBlock().

◆ createCopyPrivate()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createCopyPrivate ( const LocationDescription Loc,
llvm::Value BufSize,
llvm::Value CpyBuf,
llvm::Value CpyFn,
llvm::Value DidIt 
)

Generator for __kmpc_copyprivate.

Parameters
LocThe source location description.
BufSizeNumber of elements in the buffer.
CpyBufList of pointers to data to be copied.
CpyFnfunction to call for copying data.
DidItflag variable; 1 for 'single' thread, 0 otherwise.
Returns
The insertion position after the CopyPrivate call.

Definition at line 5620 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::getInt32Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::saveIP(), and updateToLocation().

Referenced by createSingle().

◆ createCritical()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createCritical ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB,
StringRef  CriticalName,
Value HintInst 
)

Generator for '#omp critical'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region body code.
FiniCBCallback to finalize variable copies.
CriticalNamename of the lock used by the critical directive
HintInstHint Instruction for hint clause associated with critical
Returns
The insertion position after the critical.

Definition at line 5716 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::SmallVectorTemplateBase< T, bool >::push_back(), and updateToLocation().

◆ createDispatchDeinitFunction()

FunctionCallee OpenMPIRBuilder::createDispatchDeinitFunction ( )

Returns __kmpc_dispatch_deinit runtime function.

Definition at line 6740 of file OMPIRBuilder.cpp.

References getOrCreateRuntimeFunction(), and M.

◆ createDispatchFiniFunction()

FunctionCallee OpenMPIRBuilder::createDispatchFiniFunction ( unsigned  IVSize,
bool  IVSigned 
)

Returns __kmpc_dispatch_fini_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 6727 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), M, and Name.

◆ createDispatchInitFunction()

FunctionCallee OpenMPIRBuilder::createDispatchInitFunction ( unsigned  IVSize,
bool  IVSigned 
)

Returns __kmpc_dispatch_init_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 6701 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), M, and Name.

◆ createDispatchNextFunction()

FunctionCallee OpenMPIRBuilder::createDispatchNextFunction ( unsigned  IVSize,
bool  IVSigned 
)

Returns __kmpc_dispatch_next_* runtime function for the specified size IVSize and sign IVSigned.

Definition at line 6714 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), M, and Name.

◆ createFlush()

void OpenMPIRBuilder::createFlush ( const LocationDescription Loc)

Generator for '#omp flush'.

Parameters
LocThe location where the flush directive was encountered

Definition at line 1707 of file OMPIRBuilder.cpp.

References emitFlush(), and updateToLocation().

◆ createForStaticInitFunction()

FunctionCallee OpenMPIRBuilder::createForStaticInitFunction ( unsigned  IVSize,
bool  IVSigned,
bool  IsGPUDistribute 
)

Returns __kmpc_for_static_init_* runtime function for the specified size IVSize and sign IVSigned.

Will create a distribute call __kmpc_distribute_static_init* if IsGPUDistribute is set.

Definition at line 6681 of file OMPIRBuilder.cpp.

References assert(), getOrCreateRuntimeFunction(), M, and Name.

◆ createGlobalFlag()

GlobalValue * OpenMPIRBuilder::createGlobalFlag ( unsigned  Value,
StringRef  Name 
)

Create a hidden global flag Name in the module with initial value Value.

Definition at line 822 of file OMPIRBuilder.cpp.

References llvm::Module::getContext(), llvm::Type::getInt32Ty(), llvm::GlobalValue::HiddenVisibility, M, Name, and llvm::GlobalValue::WeakODRLinkage.

◆ createLoopSkeleton()

CanonicalLoopInfo * OpenMPIRBuilder::createLoopSkeleton ( DebugLoc  DL,
Value TripCount,
Function F,
BasicBlock PreInsertBefore,
BasicBlock PostInsertBefore,
const Twine Name = {} 
)

Create the control flow structure of a canonical OpenMP loop.

The emitted loop will be disconnected, i.e. no edge to the loop's preheader and no terminator in the AfterBB. The OpenMPIRBuilder's IRBuilder location is not preserved.

Parameters
DLDebugLoc used for the instructions in the skeleton.
TripCountValue to be used for the trip count.
FFunction in which to insert the BasicBlocks.
PreInsertBeforeWhere to insert BBs that execute before the body, typically the body itself.
PostInsertBeforeWhere to insert BBs that execute after the body.
NameBase name used to derive BB and instruction names.
Returns
The CanonicalLoopInfo that represents the emitted loop.

Definition at line 3889 of file OMPIRBuilder.cpp.

References llvm::PHINode::addIncoming(), After, llvm::CanonicalLoopInfo::assertOK(), Builder, Cond, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateAdd(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpULT(), llvm::IRBuilderBase::CreatePHI(), DL, F, llvm::Module::getContext(), llvm::Value::getType(), LoopInfos, M, Name, llvm::IRBuilderBase::SetCurrentDebugLocation(), and llvm::IRBuilderBase::SetInsertPoint().

Referenced by collapseLoops(), createCanonicalLoop(), and tileLoops().

◆ createMapperAllocas()

void OpenMPIRBuilder::createMapperAllocas ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
unsigned  NumOperands,
struct MapperAllocas MapperAllocas 
)

◆ createMasked()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createMasked ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB,
Value Filter 
)

Generator for '#omp masked'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finialize variable copies.
Returns
The insertion position after the masked.

Definition at line 3865 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::Filter, getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().

◆ createMaster()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createMaster ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB 
)

Generator for '#omp master'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
Returns
The insertion position after the master.

Definition at line 3841 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().

◆ createOffloadEntriesAndInfoMetadata()

void OpenMPIRBuilder::createOffloadEntriesAndInfoMetadata ( EmitMetadataErrorReportFunctionTy ErrorReportFunction)

◆ createOffloadEntry()

void OpenMPIRBuilder::createOffloadEntry ( Constant ID,
Constant Addr,
uint64_t  Size,
int32_t  Flags,
GlobalValue::LinkageTypes  ,
StringRef  Name = "" 
)

◆ createOffloadMapnames()

GlobalVariable * OpenMPIRBuilder::createOffloadMapnames ( SmallVectorImpl< llvm::Constant * > &  Names,
std::string  VarName 
)

◆ createOffloadMaptypes()

GlobalVariable * OpenMPIRBuilder::createOffloadMaptypes ( SmallVectorImpl< uint64_t > &  Mappings,
std::string  VarName 
)

Create the global variable holding the offload mappings information.

Definition at line 7538 of file OMPIRBuilder.cpp.

References llvm::ConstantDataArray::get(), llvm::Module::getContext(), llvm::Value::getType(), llvm::GlobalValue::Global, M, and llvm::GlobalValue::PrivateLinkage.

Referenced by emitOffloadingArrays().

◆ createOMPAlloc()

CallInst * OpenMPIRBuilder::createOMPAlloc ( const LocationDescription Loc,
Value Size,
Value Allocator,
std::string  Name = "" 
)

Create a runtime call for kmpc_Alloc.

Parameters
LocThe insert and source location description.
SizeSize of allocated memory space
AllocatorAllocator information instruction
NameName of call Instruction for OMP_alloc
Returns
CallInst to the OMP_Alloc call

Definition at line 5994 of file OMPIRBuilder.cpp.

References Allocator, Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, Size, and updateToLocation().

◆ createOMPFree()

CallInst * OpenMPIRBuilder::createOMPFree ( const LocationDescription Loc,
Value Addr,
Value Allocator,
std::string  Name = "" 
)

Create a runtime call for kmpc_free.

Parameters
LocThe insert and source location description.
AddrAddress of memory space to be freed
AllocatorAllocator information instruction
NameName of call Instruction for OMP_Free
Returns
CallInst to the OMP_Free call

Definition at line 6011 of file OMPIRBuilder.cpp.

References Addr, Allocator, Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), Name, and updateToLocation().

◆ createOMPInteropDestroy()

CallInst * OpenMPIRBuilder::createOMPInteropDestroy ( const LocationDescription Loc,
Value InteropVar,
Value Device,
Value NumDependences,
Value DependenceAddress,
bool  HaveNowaitClause 
)

Create a runtime call for __tgt_interop_destroy.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_destroy call

Definition at line 6055 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().

◆ createOMPInteropInit()

CallInst * OpenMPIRBuilder::createOMPInteropInit ( const LocationDescription Loc,
Value InteropVar,
omp::OMPInteropType  InteropType,
Value Device,
Value NumDependences,
Value DependenceAddress,
bool  HaveNowaitClause 
)

Create a runtime call for __tgt_interop_init.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
InteropTypetype of interop operation
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_init call

Definition at line 6026 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().

◆ createOMPInteropUse()

CallInst * OpenMPIRBuilder::createOMPInteropUse ( const LocationDescription Loc,
Value InteropVar,
Value Device,
Value NumDependences,
Value DependenceAddress,
bool  HaveNowaitClause 
)

Create a runtime call for __tgt_interop_use.

Parameters
LocThe insert and source location description.
InteropVarvariable to be allocated
Devicedevide to which offloading will occur
NumDependencesnumber of dependence variables
DependenceAddresspointer to dependence variables
HaveNowaitClausedoes nowait clause exist
Returns
CallInst to the __tgt_interop_use call

Definition at line 6082 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::ConstantPointerNull::get(), llvm::Constant::getAllOnesValue(), llvm::Module::getContext(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::PointerType::getUnqual(), Int32, M, and updateToLocation().

◆ createOrderedDepend()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createOrderedDepend ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
unsigned  NumLoops,
ArrayRef< llvm::Value * >  StoreValues,
const Twine Name,
bool  IsDependSource 
)

Generator for '#omp ordered depend (source | sink)'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion point to be used for alloca instructions.
NumLoopsThe number of loops in depend clause.
StoreValuesThe value will be stored in vector address.
NameThe name of alloca instruction.
IsDependSourceIf true, depend source; otherwise, depend sink.
Returns
The insertion position after the ordered.

Definition at line 5751 of file OMPIRBuilder.cpp.

References llvm::all_of(), assert(), Builder, llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreateStore(), llvm::ArrayType::get(), llvm::IRBuilderBase::getInt64(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::Value::getType(), I, Int64, llvm::OpenMPIRBuilder::LocationDescription::IP, Name, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::AllocaInst::setAlignment(), llvm::StoreInst::setAlignment(), and updateToLocation().

◆ createOrderedThreadsSimd()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createOrderedThreadsSimd ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB,
bool  IsThreads 
)

Generator for '#omp ordered [threads | simd]'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
IsThreadsIf true, with threads clause or without clause; otherwise, with simd clause;
Returns
The insertion position after the ordered.

Definition at line 5797 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::OpenMPIRBuilder::LocationDescription::IP, and updateToLocation().

◆ createParallel()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createParallel ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
BodyGenCallbackTy  BodyGenCB,
PrivatizeCallbackTy  PrivCB,
FinalizeCallbackTy  FiniCB,
Value IfCondition,
Value NumThreads,
omp::ProcBindKind  ProcBind,
bool  IsCancellable 
)

Generator for '#omp parallel'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion points to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.
PrivCBCallback to copy a given variable (think copy constructor).
FiniCBCallback to finalize variable copies.
IfConditionThe evaluated 'if' clause expression, if any.
NumThreadsThe evaluated 'num_threads' clause expression, if any.
ProcBindThe value of the 'proc_bind' clause (see ProcBindKind).
IsCancellableFlag to indicate a cancellable parallel region.
Returns
The insertion position after the parallel.

Definition at line 1368 of file OMPIRBuilder.cpp.

References addOutlineInfo(), assert(), llvm::BasicBlock::begin(), Blocks, Builder, llvm::OpenMPIRBuilder::OutlineInfo::collectBlocks(), Config, llvm::SmallPtrSetImpl< PtrType >::count(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateIntCast(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateStore(), llvm::dbgs(), llvm::SetVector< T, Vector, Set, N >::empty(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, FinalizationStack, llvm::CodeExtractor::findAllocas(), llvm::CodeExtractor::findInputsOutputs(), llvm::DataLayout::getAllocaAddrSpace(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::CallBase::getCalledFunction(), llvm::FunctionCallee::getCallee(), llvm::IRBuilderBase::getContext(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::BasicBlock::getFirstInsertionPt(), llvm::IRBuilderBase::GetInsertBlock(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_node_with_parent< NodeTy, ParentTy, Options >::getNextNode(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), hostParallelCallback(), I, llvm::Instruction::insertAfter(), Int32, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), LLVM_DEBUG, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, Ptr, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::SetVector< T, Vector, Set, N >::remove_if(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::Value::setName(), llvm::BasicBlock::splitBasicBlock(), llvm::SplitBlock(), llvm::Error::success(), llvm::Expected< T >::takeError(), targetParallelCallback(), updateToLocation(), and Uses.

◆ createPlatformSpecificName()

std::string OpenMPIRBuilder::createPlatformSpecificName ( ArrayRef< StringRef Parts) const

Get the create a name using the platform specific separators.

Parameters
Partsparts of the final name that needs separation The created name has a first separator between the first and second part and a second separator between all other parts. E.g. with FirstSeparator "$" and Separator "." and parts: "p1", "p2", "p3", "p4" The resulting name is "p1$p2.p3.p4" The separators are retrieved from the OpenMPIRBuilderConfig.

Definition at line 7487 of file OMPIRBuilder.cpp.

References Config, llvm::OpenMPIRBuilderConfig::firstSeparator(), and llvm::OpenMPIRBuilderConfig::separator().

Referenced by emitOffloadingArrays(), emitTargetRegionFunction(), and registerTargetGlobalVariable().

◆ createReductions()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createReductions ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
ArrayRef< ReductionInfo ReductionInfos,
ArrayRef< bool IsByRef,
bool  IsNoWait = false 
)

Generator for '#omp reduction'.

Emits the IR instructing the runtime to perform the specific kind of reductions. Expects reduction variables to have been privatized and initialized to reduction-neutral values separately. Emits the calls to runtime functions as well as the reduction function and the basic blocks performing the reduction atomically and non-atomically.

The code emitted for the following:

type var_1;
type var_2;
#pragma omp <directive> reduction(reduction-op:var_1,var_2)
/* body */;

corresponds to the following sketch.

void _outlined_par() {
// N is the number of different reductions.
void *red_array[] = {privatized_var_1, privatized_var_2, ...};
switch(__kmpc_reduce(..., N, /*size of data in red array*/, red_array,
_omp_reduction_func,
_gomp_critical_user.reduction.var)) {
case 1: {
var_1 = var_1 <reduction-op> privatized_var_1;
var_2 = var_2 <reduction-op> privatized_var_2;
// ...
__kmpc_end_reduce(...);
break;
}
case 2: {
_Atomic<ReductionOp>(var_1, privatized_var_1);
_Atomic<ReductionOp>(var_2, privatized_var_2);
// ...
break;
}
default: break;
}
}
void _omp_reduction_func(void **lhs, void **rhs) {
*(type *)lhs[0] = *(type *)lhs[0] <reduction-op> *(type *)rhs[0];
*(type *)lhs[1] = *(type *)lhs[1] <reduction-op> *(type *)rhs[1];
// ...
}
#define op(i)
Straight line strength reduction
#define N
Parameters
LocThe location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables.
AllocaIPAn insertion point suitable for allocas usable in reductions.
ReductionInfosA list of info on each reduction variable.
IsNoWaitA flag set if the reduction is marked as nowait.
IsByRefA flag set if the reduction is using reference or direct value.

Definition at line 3662 of file OMPIRBuilder.cpp.

References llvm::all_of(), assert(), llvm::OpenMPIRBuilder::ReductionInfo::AtomicReductionGen, Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_64(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateRetVoid(), llvm::IRBuilderBase::CreateStore(), llvm::IRBuilderBase::CreateSwitch(), llvm::IRBuilderBase::CreateUnreachable(), DL, llvm::OpenMPIRBuilder::ReductionInfo::ElementType, llvm::BasicBlock::end(), llvm::enumerate(), llvm::Instruction::eraseFromParent(), llvm::ArrayType::get(), llvm::Function::getArg(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), getFreshReductionFunc(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::IRBuilderBase::InsertPoint::getPoint(), llvm::IRBuilderBase::getPtrTy(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), llvm::OpenMPIRBuilder::LocationDescription::IP, LHS, llvm::none_of(), P, llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGen, llvm::IRBuilderBase::restoreIP(), RHS, llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), llvm::BasicBlock::splitBasicBlock(), llvm::Expected< T >::takeError(), updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.

◆ createReductionsGPU()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createReductionsGPU ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
InsertPointTy  CodeGenIP,
ArrayRef< ReductionInfo ReductionInfos,
bool  IsNoWait = false,
bool  IsTeamsReduction = false,
bool  HasDistribute = false,
ReductionGenCBKind  ReductionGenCBKind = ReductionGenCBKind::MLIR,
std::optional< omp::GV GridValue = {},
unsigned  ReductionBufNum = 1024,
Value SrcLocInfo = nullptr 
)

Design of OpenMP reductions on the GPU.

Consider a typical OpenMP program with one or more reduction clauses:

float foo; double bar; #pragma omp target teams distribute parallel for \ reduction(+:foo) reduction(*:bar) for (int i = 0; i < N; i++) { foo += A[i]; bar *= B[i]; }

where 'foo' and 'bar' are reduced across all OpenMP threads in all teams. In our OpenMP implementation on the NVPTX device an OpenMP team is mapped to a CUDA threadblock and OpenMP threads within a team are mapped to CUDA threads within a threadblock. Our goal is to efficiently aggregate values across all OpenMP threads such that:

  • the compiler and runtime are logically concise, and
  • the reduction is performed efficiently in a hierarchical manner as follows: within OpenMP threads in the same warp, across warps in a threadblock, and finally across teams on the NVPTX device.

Introduction to Decoupling

We would like to decouple the compiler and the runtime so that the latter is ignorant of the reduction variables (number, data types) and the reduction operators. This allows a simpler interface and implementation while still attaining good performance.

Pseudocode for the aforementioned OpenMP program generated by the compiler is as follows:

  1. Create private copies of reduction variables on each OpenMP thread: 'foo_private', 'bar_private'
  2. Each OpenMP thread reduces the chunk of 'A' and 'B' assigned to it and writes the result in 'foo_private' and 'bar_private' respectively.
  3. Call the OpenMP runtime on the GPU to reduce within a team and store the result on the team master:

    __kmpc_nvptx_parallel_reduce_nowait_v2(..., reduceData, shuffleReduceFn, interWarpCpyFn)

    where: struct ReduceData { double *foo; double *bar; } reduceData reduceData.foo = &foo_private reduceData.bar = &bar_private

    'shuffleReduceFn' and 'interWarpCpyFn' are pointers to two auxiliary functions generated by the compiler that operate on variables of type 'ReduceData'. They aid the runtime perform algorithmic steps in a data agnostic manner.

    'shuffleReduceFn' is a pointer to a function that reduces data of type 'ReduceData' across two OpenMP threads (lanes) in the same warp. It takes the following arguments as input:

    a. variable of type 'ReduceData' on the calling lane, b. its lane_id, c. an offset relative to the current lane_id to generate a remote_lane_id. The remote lane contains the second variable of type 'ReduceData' that is to be reduced. d. an algorithm version parameter determining which reduction algorithm to use.

    'shuffleReduceFn' retrieves data from the remote lane using efficient GPU shuffle intrinsics and reduces, using the algorithm specified by the 4th parameter, the two operands element-wise. The result is written to the first operand.

    Different reduction algorithms are implemented in different runtime functions, all calling 'shuffleReduceFn' to perform the essential reduction step. Therefore, based on the 4th parameter, this function behaves slightly differently to cooperate with the runtime to ensure correctness under different circumstances.

    'InterWarpCpyFn' is a pointer to a function that transfers reduced variables across warps. It tunnels, through CUDA shared memory, the thread-private data of type 'ReduceData' from lane 0 of each warp to a lane in the first warp.

  4. Call the OpenMP runtime on the GPU to reduce across teams. The last team writes the global reduced value to memory.

    ret = __kmpc_nvptx_teams_reduce_nowait(..., reduceData, shuffleReduceFn, interWarpCpyFn, scratchpadCopyFn, loadAndReduceFn)

    'scratchpadCopyFn' is a helper that stores reduced data from the team master to a scratchpad array in global memory.

    'loadAndReduceFn' is a helper that loads data from the scratchpad array and reduces it with the input operand.

    These compiler generated functions hide address calculation and alignment information from the runtime.

  5. if ret == 1: The team master of the last team stores the reduced result to the globals in memory. foo += reduceData.foo; bar *= reduceData.bar

Warp Reduction Algorithms

On the warp level, we have three algorithms implemented in the OpenMP runtime depending on the number of active lanes:

Full Warp Reduction

The reduce algorithm within a warp where all lanes are active is implemented in the runtime as follows:

full_warp_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { for (int offset = WARPSIZE/2; offset > 0; offset /= 2) ShuffleReduceFn(reduce_data, 0, offset, 0); }

The algorithm completes in log(2, WARPSIZE) steps.

'ShuffleReduceFn' is used here with lane_id set to 0 because it is not used therefore we save instructions by not retrieving lane_id from the corresponding special registers. The 4th parameter, which represents the version of the algorithm being used, is set to 0 to signify full warp reduction.

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

#reduce_elem refers to an element in the local lane's data structure #remote_elem is retrieved from a remote lane remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); reduce_elem = reduce_elem REDUCE_OP remote_elem;

Contiguous Partial Warp Reduction

This reduce algorithm is used within a warp where only the first 'n' (n <= WARPSIZE) lanes are active. It is typically used when the number of OpenMP threads in a parallel region is not a multiple of WARPSIZE. The algorithm is implemented in the runtime as follows:

void contiguous_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn, int size, int lane_id) { int curr_size; int offset; curr_size = size; mask = curr_size/2; while (offset>0) { ShuffleReduceFn(reduce_data, lane_id, offset, 1); curr_size = (curr_size+1)/2; offset = curr_size/2; } }

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id < offset) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem

This algorithm assumes that the data to be reduced are located in a contiguous subset of lanes starting from the first. When there is an odd number of active lanes, the data in the last lane is not aggregated with any other lane's dat but is instead copied over.

Dispersed Partial Warp Reduction

This algorithm is used within a warp when any discontiguous subset of lanes are active. It is used to implement the reduction operation across lanes in an OpenMP simd region or in a nested parallel region.

void dispersed_partial_reduce(void *reduce_data, kmp_ShuffleReductFctPtr ShuffleReduceFn) { int size, remote_id; int logical_lane_id = number_of_active_lanes_before_me() * 2; do { remote_id = next_active_lane_id_right_after_me();

the above function returns 0 of no active lane

is present right after the current lane.

size = number_of_active_lanes_in_this_warp(); logical_lane_id /= 2; ShuffleReduceFn(reduce_data, logical_lane_id, remote_id-1-threadIdx.x, 2); } while (logical_lane_id % 2 == 0 && size > 1); }

There is no assumption made about the initial state of the reduction. Any number of lanes (>=1) could be active at any position. The reduction result is returned in the first active lane.

In this version, 'ShuffleReduceFn' behaves, per element, as follows:

remote_elem = shuffle_down(reduce_elem, offset, WARPSIZE); if (lane_id % 2 == 0 && offset > 0) reduce_elem = reduce_elem REDUCE_OP remote_elem else reduce_elem = remote_elem

Intra-Team Reduction

This function, as implemented in the runtime call '__kmpc_nvptx_parallel_reduce_nowait_v2', aggregates data across OpenMP threads in a team. It first reduces within a warp using the aforementioned algorithms. We then proceed to gather all such reduced values at the first warp.

The runtime makes use of the function 'InterWarpCpyFn', which copies data from each of the "warp master" (zeroth lane of each warp, where warp-reduced data is held) to the zeroth warp. This step reduces (in a mathematical sense) the problem of reduction across warp masters in a block to the problem of warp reduction.

Inter-Team Reduction

Once a team has reduced its data to a single value, it is stored in a global scratchpad array. Since each team has a distinct slot, this can be done without locking.

The last team to write to the scratchpad array proceeds to reduce the scratchpad array. One or more workers in the last team use the helper 'loadAndReduceDataFn' to load and reduce values from the array, i.e., the k'th worker reduces every k'th element.

Finally, a call is made to '__kmpc_nvptx_parallel_reduce_nowait_v2' to reduce across workers and compute a globally reduced value.

Parameters
LocThe location where the reduction was encountered. Must be within the associate directive and after the last local access to the reduction variables.
AllocaIPAn insertion point suitable for allocas usable in reductions.
CodeGenIPAn insertion point suitable for code generation.
ReductionInfosA list of info on each reduction variable.
IsNoWaitOptional flag set if the reduction is marked as nowait.
IsTeamsReductionOptional flag set if it is a teams reduction.
HasDistributeOptional flag set if it is a distribute reduction.
GridValueOptional GPU grid value.
ReductionBufNumOptional OpenMPCUDAReductionBufNumValue to be used for teams reduction.
SrcLocInfoSource location information global.

Definition at line 3465 of file OMPIRBuilder.cpp.

References llvm::AttrBuilder::addAttribute(), llvm::AttributeList::addFnAttributes(), assert(), Builder, checkReductionInfos(), Clang, Cond, Config, llvm::BasicBlock::Create(), llvm::StructType::create(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreatePointerBitCastOrAddrSpaceCast(), llvm::IRBuilderBase::CreateStore(), emitBlock(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::enumerate(), llvm::ArrayType::get(), llvm::Function::getAttributes(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::DataLayout::getDefaultGlobalsAddressSpace(), llvm::AttributeList::getFnAttrs(), getGridValue(), llvm::IRBuilderBase::getIndexTy(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), llvm::Value::getName(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), LHS, M, llvm::OpenMPIRBuilder::ReductionInfo::PrivateVariable, llvm::OpenMPIRBuilder::ReductionInfo::ReductionGenClang, llvm::AttrBuilder::removeAttribute(), llvm::Value::replaceUsesWithIf(), llvm::IRBuilderBase::restoreIP(), RHS, llvm::IRBuilderBase::saveIP(), llvm::OpenMPIRBuilderConfig::setEmitLLVMUsed(), llvm::OpenMPIRBuilderConfig::setGridValue(), llvm::ArrayRef< T >::size(), Size, llvm::Expected< T >::takeError(), updateToLocation(), and llvm::OpenMPIRBuilder::ReductionInfo::Variable.

◆ createSection()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSection ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB 
)

Generator for '#omp section'.

Parameters
LocThe insert and source location description.
BodyGenCBCallback that will generate the region body code.
FiniCBCallback to finalize variable copies.
Returns
The insertion position after the section.

Definition at line 2223 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::BasicBlock::getSinglePredecessor(), llvm::Instruction::getSuccessor(), llvm::BasicBlock::getTerminator(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), and updateToLocation().

◆ createSections()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSections ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
ArrayRef< StorableBodyGenCallbackTy SectionCBs,
PrivatizeCallbackTy  PrivCB,
FinalizeCallbackTy  FiniCB,
bool  IsCancellable,
bool  IsNowait 
)

Generator for '#omp sections'.

Parameters
LocThe insert and source location description.
AllocaIPThe insertion points to be used for alloca instructions.
SectionCBsCallbacks that will generate body of each section.
PrivCBCallback to copy a given variable (think copy constructor).
FiniCBCallback to finalize variable copies.
IsCancellableFlag to indicate a cancellable parallel region.
IsNowaitIf true, barrier - to ensure all sections are executed before moving forward will not be generated.
Returns
The insertion position after the sections.

Definition at line 2119 of file OMPIRBuilder.cpp.

References llvm::SwitchInst::addCase(), assert(), llvm::BasicBlock::begin(), Builder, llvm::Continue, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateBr(), createCanonicalLoop(), llvm::IRBuilderBase::CreateSwitch(), llvm::OpenMPIRBuilder::LocationDescription::DL, FinalizationStack, llvm::Module::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt32Ty(), llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, isConflictIP(), M, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), llvm::splitBBWithSuffix(), llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createSingle()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createSingle ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
FinalizeCallbackTy  FiniCB,
bool  IsNowait,
ArrayRef< llvm::Value * >  CPVars = {},
ArrayRef< llvm::Function * >  CPFuncs = {} 
)

Generator for '#omp single'.

Parameters
LocThe source location description.
BodyGenCBCallback that will generate the region code.
FiniCBCallback to finalize variable copies.
IsNowaitIf false, a barrier is emitted.
CPVarscopyprivate variables.
CPFuncscopy functions to use for each copyprivate variable.
Returns
The insertion position after the single call.

Definition at line 5641 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateAlloca(), createBarrier(), llvm::IRBuilderBase::CreateCall(), createCopyPrivate(), llvm::IRBuilderBase::CreateStore(), llvm::OpenMPIRBuilder::LocationDescription::DL, llvm::ArrayRef< T >::empty(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::getInt32(), llvm::Type::getInt32Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), I, Int64, llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::saveIP(), llvm::ArrayRef< T >::size(), llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createTarget()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTarget ( const LocationDescription Loc,
bool  IsOffloadEntry,
OpenMPIRBuilder::InsertPointTy  AllocaIP,
OpenMPIRBuilder::InsertPointTy  CodeGenIP,
TargetRegionEntryInfo EntryInfo,
ArrayRef< int32_t >  NumTeams,
ArrayRef< int32_t >  NumThreads,
SmallVectorImpl< Value * > &  Inputs,
GenMapInfoCallbackTy  GenMapInfoCB,
OpenMPIRBuilder::TargetBodyGenCallbackTy  CBFunc,
OpenMPIRBuilder::TargetGenArgAccessorsCallbackTy  ArgAccessorFuncCB,
SmallVector< DependData Dependencies = {},
bool  HasNowait = false 
)

Generator for '#omp target'.

Parameters
Locwhere the target data construct was encountered.
IsOffloadEntrywhether it is an offload entry.
CodeGenIPThe insertion point where the call to the outlined function should be emitted.
EntryInfoThe entry information about the function.
NumTeamsNumber of teams specified in the num_teams clause.
NumThreadsNumber of teams specified in the thread_limit clause.
InputsThe input values to the region that will be passed. as arguments to the outlined function.
BodyGenCBCallback that will generate the region code.
ArgAccessorFuncCBCallback that will generate accessors instructions for passed in target arguments where neccessary
DependenciesA vector of DependData objects that carry

Definition at line 7440 of file OMPIRBuilder.cpp.

References Builder, Config, emitTargetCall(), emitTargetOutlinedFunction(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), and updateToLocation().

◆ createTargetData()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTargetData ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
InsertPointTy  CodeGenIP,
Value DeviceID,
Value IfCond,
TargetDataInfo Info,
GenMapInfoCallbackTy  GenMapInfoCB,
omp::RuntimeFunction MapperFunc = nullptr,
function_ref< InsertPointOrErrorTy(InsertPointTy CodeGenIP, BodyGenTy BodyGenType)>  BodyGenCB = nullptr,
function_ref< void(unsigned int, Value *)>  DeviceAddrCB = nullptr,
function_ref< Value *(unsigned int)>  CustomMapperCB = nullptr,
Value SrcLocInfo = nullptr 
)

Generator for '#omp target data'.

Parameters
LocThe location where the target data construct was encountered.
AllocaIPThe insertion points to be used for alloca instructions.
CodeGenIPThe insertion point at which the target directive code should be placed.
IsBeginIf true then emits begin mapper call otherwise emits end mapper call.
DeviceIDStores the DeviceID from the device clause.
IfCondValue which corresponds to the if clause condition.
InfoStores all information realted to the Target Data directive.
GenMapInfoCBCallback that populates the MapInfos and returns.
BodyGenCBOptional Callback to generate the region code.
DeviceAddrCBOptional callback to generate code related to use_device_ptr and use_device_addr.
CustomMapperCBOptional callback to generate code related to custom mappers.

Definition at line 6479 of file OMPIRBuilder.cpp.

References llvm::SmallVectorImpl< T >::append(), assert(), llvm::OpenMPIRBuilder::TargetDataRTArgs::BasePointersArray, Builder, Config, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateStore(), DupNoPriv, emitBlock(), emitIfClause(), emitOffloadingArrays(), emitOffloadingArraysArgument(), emitTargetTask(), llvm::SmallVectorBase< Size_T >::empty(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::Constant::getNullValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), llvm::IRBuilderBase::getPtrTy(), Info, Int32, llvm::OpenMPIRBuilderConfig::IsTargetDevice, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapNamesArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MappersArray, llvm::OpenMPIRBuilder::TargetDataRTArgs::MapTypesArray, llvm::OpenMPIRBuilder::MapInfosTy::Names, NoPriv, llvm::OpenMPIRBuilder::TargetDataRTArgs::PointersArray, Priv, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::OpenMPIRBuilder::TargetDataRTArgs::SizesArray, llvm::Error::success(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ createTargetDeinit()

void OpenMPIRBuilder::createTargetDeinit ( const LocationDescription Loc,
int32_t  TeamsReductionDataSize = 0,
int32_t  TeamsReductionBufferLength = 1024 
)

Create a runtime call for kmpc_target_deinit.

Parameters
LocThe insert and source location description.
TeamsReductionDataSizeThe maximal size of all the reduction data for teams reduction.
TeamsReductionBufferLengthThe number of elements (each of up to TeamsReductionDataSize size), in the teams reduction buffer.

Definition at line 6265 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::ConstantFoldInsertValueInstruction(), llvm::IRBuilderBase::CreateCall(), llvm::StringRef::drop_back(), llvm::StringRef::ends_with(), llvm::IRBuilderBase::GetInsertBlock(), llvm::Value::getName(), llvm::Module::getNamedGlobal(), getOrCreateRuntimeFunctionPtr(), llvm::BasicBlock::getParent(), Int32, M, and updateToLocation().

Referenced by createOutlinedFunction().

◆ createTargetInit()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::createTargetInit ( const LocationDescription Loc,
bool  IsSPMD,
int32_t  MinThreadsVal = 0,
int32_t  MaxThreadsVal = 0,
int32_t  MinTeamsVal = 0,
int32_t  MaxTeamsVal = 0 
)

The omp target interface.

For more information about the usage of this interface,

See also
openmp/libomptarget/deviceRTLs/common/include/target.h

{ Create a runtime call for kmpc_target_init

Parameters
LocThe insert and source location description.
IsSPMDFlag to indicate if the kernel is an SPMD kernel or not.
MinThreadsMinimal number of threads, or 0.
MaxThreadsMaximal number of threads, or 0.
MinTeamsMinimal number of teams, or 0.
MaxTeamsMaximal number of teams, or 0.

Definition at line 6131 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateRetVoid(), llvm::IRBuilderBase::CreateUnreachable(), DL, llvm::StringRef::drop_back(), llvm::StringRef::ends_with(), llvm::Instruction::eraseFromParent(), llvm::ConstantStruct::get(), llvm::ConstantExpr::getAddrSpaceCast(), llvm::Constant::getAllOnesValue(), llvm::Function::getArg(), llvm::BasicBlock::getContext(), llvm::Function::getDataLayout(), llvm::BasicBlock::getFirstInsertionPt(), llvm::Module::getFunction(), getGridValue(), llvm::IRBuilderBase::GetInsertBlock(), llvm::Value::getName(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), llvm::BasicBlock::getParent(), llvm::ConstantInt::getSigned(), llvm::BasicBlock::getTerminator(), llvm::GlobalValue::getType(), llvm::Value::getType(), Int16, Int32, llvm::OpenMPIRBuilder::LocationDescription::IP, M, MaxThreads, llvm::GlobalValue::NotThreadLocal, llvm::omp::OMP_TGT_EXEC_MODE_GENERIC, llvm::omp::OMP_TGT_EXEC_MODE_SPMD, llvm::GlobalValue::ProtectedVisibility, llvm::IRBuilderBase::SetInsertPoint(), llvm::GlobalValue::setVisibility(), llvm::BasicBlock::splitBasicBlock(), updateToLocation(), llvm::GlobalValue::WeakODRLinkage, writeTeamsForKernel(), and writeThreadBoundsForKernel().

Referenced by createOutlinedFunction().

◆ createTask()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTask ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
BodyGenCallbackTy  BodyGenCB,
bool  Tied = true,
Value Final = nullptr,
Value IfCondition = nullptr,
SmallVector< DependData Dependencies = {},
bool  Mergeable = false,
Value EventHandle = nullptr 
)

Generator for #omp task

Parameters
LocThe location where the task construct was encountered.
AllocaIPThe insertion point to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.
TiedTrue if the task is tied, false if the task is untied.
Finali1 value which is true if the task is final, false if the task is not final.
IfConditioni1 value. If it evaluates to false, an undeferred task is generated, and the encountering thread must suspend the current task region, for which execution cannot be resumed until execution of the structured block that is associated with the generated task is completed.
EventHandleIf present, signifies the event handle as part of the detach clause
MergeableIf the given task is mergeable

Definition at line 1818 of file OMPIRBuilder.cpp.

References addOutlineInfo(), Addr, llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::back(), llvm::sampleprof::Base, llvm::BasicBlock::begin(), Builder, llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_64(), createFakeIntVal(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateMemCpy(), llvm::IRBuilderBase::CreateOr(), llvm::IRBuilderBase::CreatePointerBitCastOrAddrSpaceCast(), llvm::IRBuilderBase::CreatePtrToInt(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateStore(), llvm::IRBuilderBase::CreateStructGEP(), llvm::divideCeil(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::omp::Flags, llvm::ConstantPointerNull::get(), llvm::ArrayType::get(), llvm::AllocaInst::getAllocatedType(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::Instruction::getDebugLoc(), llvm::Function::getEntryBlock(), llvm::IRBuilderBase::GetInsertPoint(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt32Ty(), llvm::IRBuilderBase::getInt64(), llvm::IRBuilderBase::getInt64Ty(), llvm::IRBuilderBase::getInt8Ty(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::Value::getPointerAlignment(), llvm::IRBuilderBase::getPtrTy(), llvm::DataLayout::getTypeSizeInBits(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), I, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, P, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::Value::replaceUsesWithIf(), llvm::IRBuilderBase::restoreIP(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::Instruction::setDebugLoc(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SmallVectorBase< Size_T >::size(), Size, llvm::splitBB(), llvm::SplitBlockAndInsertIfThenElse(), and updateToLocation().

◆ createTaskgroup()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTaskgroup ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
BodyGenCallbackTy  BodyGenCB 
)

Generator for the taskgroup construct.

Parameters
LocThe location where the taskgroup construct was encountered.
AllocaIPThe insertion point to be used for alloca instructions.
BodyGenCBCallback that will generate the region code.

Definition at line 2090 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::splitBB(), and updateToLocation().

◆ createTaskwait()

void OpenMPIRBuilder::createTaskwait ( const LocationDescription Loc)

Generator for '#omp taskwait'.

Parameters
LocThe location where the taskwait directive was encountered.

Definition at line 1726 of file OMPIRBuilder.cpp.

References emitTaskwaitImpl(), and updateToLocation().

◆ createTaskyield()

void OpenMPIRBuilder::createTaskyield ( const LocationDescription Loc)

Generator for '#omp taskyield'.

Parameters
LocThe location where the taskyield directive was encountered.

Definition at line 1744 of file OMPIRBuilder.cpp.

References emitTaskyieldImpl(), and updateToLocation().

◆ createTeams()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::createTeams ( const LocationDescription Loc,
BodyGenCallbackTy  BodyGenCB,
Value NumTeamsLower = nullptr,
Value NumTeamsUpper = nullptr,
Value ThreadLimit = nullptr,
Value IfExpr = nullptr 
)

Generator for #omp teams

Parameters
LocThe location where the teams construct was encountered.
BodyGenCBCallback that will generate the region code.
NumTeamsLowerLower bound on number of teams. If this is nullptr, it is as if lower bound is specified as equal to upperbound. If this is non-null, then upperbound must also be non-null.
NumTeamsUpperUpper bound on the number of teams.
ThreadLimiton the number of threads that may participate in a contention group created by each team.
IfExpris the integer argument value of the if condition on the teams clause.

Definition at line 8867 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::Function::arg_size(), llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, Config, llvm::IRBuilderBase::CreateCall(), createFakeIntVal(), llvm::IRBuilderBase::CreateICmpNE(), llvm::IRBuilderBase::CreateSelect(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::Function::getArg(), llvm::CallBase::getArgOperand(), llvm::Function::getEntryBlock(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::Value::getNumUses(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::BasicBlock::getParent(), llvm::Value::getType(), I, Int1, llvm::Type::isIntegerTy(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::Value::setName(), llvm::splitBB(), updateToLocation(), and llvm::Value::user_back().

◆ emitBlock()

void OpenMPIRBuilder::emitBlock ( BasicBlock BB,
Function CurFn,
bool  IsFinished = false 
)

◆ emitBranch()

void OpenMPIRBuilder::emitBranch ( BasicBlock Target)

◆ emitCancelationCheckImpl()

Error OpenMPIRBuilder::emitCancelationCheckImpl ( Value CancelFlag,
omp::Directive  CanceledDirective,
FinalizeCallbackTy  ExitCB = {} 
)

Generate control flow and cleanup for cancellation.

Parameters
CancelFlagFlag indicating if the cancellation is performed.
CanceledDirectiveThe kind of directive that is cancled.
ExitCBExtra code to be generated in the exit block.
Returns
an error, if any were triggered during execution.

Definition at line 1150 of file OMPIRBuilder.cpp.

References assert(), llvm::BasicBlock::begin(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateIsNull(), llvm::BasicBlock::end(), llvm::Instruction::eraseFromParent(), FinalizationStack, llvm::BasicBlock::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::GetInsertPoint(), llvm::Value::getName(), llvm::BasicBlock::getParent(), llvm::BasicBlock::getTerminator(), isLastFinalizationInfoCancellable(), llvm::IRBuilderBase::saveIP(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SplitBlock(), and llvm::Error::success().

Referenced by createBarrier(), and createCancel().

◆ emitFlush()

void OpenMPIRBuilder::emitFlush ( const LocationDescription Loc)

Generate a flush runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1698 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), and getOrCreateSrcLocStr().

Referenced by createFlush().

◆ emitIfClause()

Error OpenMPIRBuilder::emitIfClause ( Value Cond,
BodyGenCallbackTy  ThenGen,
BodyGenCallbackTy  ElseGen,
InsertPointTy  AllocaIP = {} 
)

Emits code for OpenMP 'if' clause using specified BodyGenCallbackTy Here is the logic: if (Cond) { ThenGen(); } else { ElseGen(); }.

Returns
an error, if any were triggered during execution.

Definition at line 8240 of file OMPIRBuilder.cpp.

References Builder, Cond, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), emitBlock(), emitBranch(), llvm::Module::getContext(), llvm::IRBuilderBase::GetInsertBlock(), llvm::BasicBlock::getParent(), M, llvm::IRBuilderBase::saveIP(), and llvm::Error::success().

Referenced by createTargetData().

◆ emitKernelLaunch()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::emitKernelLaunch ( const LocationDescription Loc,
Value OutlinedFnID,
EmitFallbackCallbackTy  EmitTargetCallFallbackCB,
TargetKernelArgs Args,
Value DeviceID,
Value RTLoc,
InsertPointTy  AllocaIP 
)

Generate a target region entry call and host fallback call.

Parameters
LocThe location at which the request originated and is fulfilled.
OutlinedFnIDThe ooulined function ID.
EmitTargetCallFallbackCBCall back function to generate host fallback code.
ArgsData structure holding information about the kernel arguments.
DeviceIDIdentifier for the device via the 'device' clause.
RTLocSource location identifier
AllocaIPThe insertion point to be used for alloca instructions.

Definition at line 1086 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateIsNotNull(), emitBlock(), emitBranch(), emitTargetKernel(), llvm::Failed(), llvm::IRBuilderBase::getContext(), llvm::IRBuilderBase::GetInsertBlock(), getKernelArgsVector(), llvm::BasicBlock::getParent(), llvm::OpenMPIRBuilder::LocationDescription::IP, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::Expected< T >::takeError(), and updateToLocation().

◆ emitMapperCall()

void OpenMPIRBuilder::emitMapperCall ( const LocationDescription Loc,
Function MapperFunc,
Value SrcLocInfo,
Value MaptypesArg,
Value MapnamesArg,
struct MapperAllocas MapperAllocas,
int64_t  DeviceID,
unsigned  NumOperands 
)

Create the call for the target mapper function.

Parameters
LocThe source location description.
MapperFuncFunction to be called.
SrcLocInfoSource location information global.
MaptypesArgThe argument types.
MapnamesArgThe argument names.
MapperAllocasThe AllocaInst used for the call.
DeviceIDDevice ID for the call.
NumOperandsNumber of operands in the call.

Definition at line 7572 of file OMPIRBuilder.cpp.

References llvm::OpenMPIRBuilder::MapperAllocas::Args, llvm::OpenMPIRBuilder::MapperAllocas::ArgsBase, llvm::OpenMPIRBuilder::MapperAllocas::ArgSizes, Builder, llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::ArrayType::get(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt64(), llvm::Constant::getNullValue(), llvm::PointerType::getUnqual(), Int64, and updateToLocation().

◆ emitNonContiguousDescriptor()

void OpenMPIRBuilder::emitNonContiguousDescriptor ( InsertPointTy  AllocaIP,
InsertPointTy  CodeGenIP,
MapInfosTy CombinedInfo,
TargetDataInfo Info 
)

◆ emitOffloadingArrays()

void OpenMPIRBuilder::emitOffloadingArrays ( InsertPointTy  AllocaIP,
InsertPointTy  CodeGenIP,
MapInfosTy CombinedInfo,
TargetDataInfo Info,
bool  IsNonContiguous = false,
function_ref< void(unsigned int, Value *)>  DeviceAddrCB = nullptr,
function_ref< Value *(unsigned int)>  CustomMapperCB = nullptr 
)

Emit the arrays used to pass the captures and map information to the offloading runtime library.

If there is no map or capture information, return nullptr by reference. Accepts a reference to a MapInfosTy object that contains information generated for mappable clauses, including base pointers, pointers, sizes, map types, user-defined mappers.

Definition at line 8014 of file OMPIRBuilder.cpp.

References Address, llvm::SmallBitVector::all(), llvm::SmallBitVector::any(), llvm::OpenMPIRBuilder::MapInfosTy::BasePointers, Builder, llvm::IRBuilderBase::CreateAlignedStore(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateConstInBoundsGEP2_32(), llvm::IRBuilderBase::CreateInBoundsGEP(), llvm::IRBuilderBase::CreateIntCast(), llvm::IRBuilderBase::CreateMemCpy(), createOffloadMapnames(), createOffloadMaptypes(), createPlatformSpecificName(), llvm::IRBuilderBase::CreatePointerCast(), llvm::OpenMPIRBuilder::MapInfosTy::DevicePointers, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Dims, emitNonContiguousDescriptor(), llvm::SmallVectorBase< Size_T >::empty(), llvm::ConstantArray::get(), llvm::ConstantPointerNull::get(), llvm::ArrayType::get(), llvm::DataLayout::getABIIntegerTypeAlignment(), llvm::AllocaInst::getAllocatedType(), llvm::AllocaInst::getAllocationSize(), llvm::IRBuilderBase::getContext(), llvm::Module::getDataLayout(), llvm::DataLayout::getIndexSizeInBits(), llvm::IRBuilderBase::getInt64Ty(), llvm::IRBuilderBase::getIntN(), llvm::Constant::getNullValue(), llvm::DataLayout::getPrefTypeAlign(), llvm::IRBuilderBase::getPtrTy(), llvm::AllocaInst::getType(), llvm::Value::getType(), llvm::PointerType::getUnqual(), llvm::GlobalValue::Global, I, Info, M, Name, llvm::OpenMPIRBuilder::MapInfosTy::Names, llvm::OpenMPIRBuilder::MapInfosTy::NonContigInfo, llvm::OpenMPIRBuilder::MapInfosTy::StructNonContiguousInfo::Offsets, P, Pointer, llvm::OpenMPIRBuilder::MapInfosTy::Pointers, llvm::GlobalValue::PrivateLinkage, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::SmallBitVector::set(), llvm::AllocaInst::setAlignment(), llvm::SmallVectorBase< Size_T >::size(), llvm::OpenMPIRBuilder::MapInfosTy::Sizes, llvm::SmallBitVector::test(), and llvm::OpenMPIRBuilder::MapInfosTy::Types.

Referenced by createTargetData(), and emitOffloadingArraysAndArgs().

◆ emitOffloadingArraysAndArgs()

void OpenMPIRBuilder::emitOffloadingArraysAndArgs ( InsertPointTy  AllocaIP,
InsertPointTy  CodeGenIP,
TargetDataInfo Info,
TargetDataRTArgs RTArgs,
MapInfosTy CombinedInfo,
bool  IsNonContiguous = false,
bool  ForEndCall = false,
function_ref< void(unsigned int, Value *)>  DeviceAddrCB = nullptr,
function_ref< Value *(unsigned int)>  CustomMapperCB = nullptr 
)

Allocates memory for and populates the arrays required for offloading (offload_{baseptrs|ptrs|mappers|sizes|maptypes|mapnames}).

Then, it emits their base addresses as arguments to be passed to the runtime library. In essence, this function is a combination of emitOffloadingArrays and emitOffloadingArraysArgument and should arguably be preferred by clients of OpenMPIRBuilder.

Definition at line 7305 of file OMPIRBuilder.cpp.

References Builder, emitOffloadingArrays(), emitOffloadingArraysArgument(), and Info.

◆ emitOffloadingArraysArgument()

void OpenMPIRBuilder::emitOffloadingArraysArgument ( IRBuilderBase Builder,
OpenMPIRBuilder::TargetDataRTArgs RTArgs,
OpenMPIRBuilder::TargetDataInfo Info,
bool  ForEndCall = false 
)

◆ emitTargetKernel()

OpenMPIRBuilder::InsertPointTy OpenMPIRBuilder::emitTargetKernel ( const LocationDescription Loc,
InsertPointTy  AllocaIP,
Value *&  Return,
Value Ident,
Value DeviceID,
Value NumTeams,
Value NumThreads,
Value HostPtr,
ArrayRef< Value * >  KernelArgs 
)

Generate a target region entry call.

Parameters
LocThe location at which the request originated and is fulfilled.
AllocaIPThe insertion point to be used for alloca instructions.
ReturnReturn value of the created function returned by reference.
DeviceIDIdentifier for the device via the 'device' clause.
NumTeamsNumer of teams for the region via the 'num_teams' clause or 0 if unspecified and -1 if there is no 'teams' clause.
NumThreadsNumber of threads via the 'thread_limit' clause.
HostPtrPointer to the host-side pointer of the target kernel.
KernelArgsArray of arguments to the kernel.

Definition at line 1056 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateAlignedStore(), llvm::IRBuilderBase::CreateAlloca(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateStructGEP(), llvm::Module::getDataLayout(), getOrCreateRuntimeFunction(), llvm::DataLayout::getPrefTypeAlign(), I, llvm::OpenMPIRBuilder::LocationDescription::IP, M, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::ArrayRef< T >::size(), Size, and updateToLocation().

Referenced by emitKernelLaunch().

◆ emitTargetRegionFunction()

Error OpenMPIRBuilder::emitTargetRegionFunction ( TargetRegionEntryInfo EntryInfo,
FunctionGenCallback GenerateFunctionCallback,
bool  IsOffloadEntry,
Function *&  OutlinedFn,
Constant *&  OutlinedFnID 
)

Create a unique name for the entry function using the source location information of the current target region.

The name will be something like:

__omp_offloading_DD_FFFF_PP_lBB[_CC]

where DD_FFFF is an ID unique to the file (device and file IDs), PP is the mangled name of the function that encloses the target region and BB is the line number of the target region. CC is a count added when more than one region is located at the same location.

If this target outline function is not an offload entry, we don't need to register it. This may happen if it is guarded by an if clause that is false at compile time, or no target archs have been specified.

The created target region ID is used by the runtime library to identify the current target region, so it only has to be unique and not necessarily point to anything. It could be the pointer to the outlined function that implements the target region, but we aren't using that so that the compiler doesn't need to keep that, and could therefore inline the host function if proven worthwhile during optimization. In the other hand, if emitting code for the device, the ID has to be the function address so that it can retrieved from the offloading entry and launched by the runtime library. We also mark the outlined function to have external linkage in case we are emitting code for the device, because these functions will be entry points to the device.

Parameters
InfoManagerThe info manager keeping track of the offload entries
EntryInfoThe entry information about the function
GenerateFunctionCallbackThe callback function to generate the code
OutlinedFunctionPointer to the outlined function
EntryFnIDNameName of the ID o be created

Definition at line 6433 of file OMPIRBuilder.cpp.

References Config, createPlatformSpecificName(), llvm::OffloadEntriesInfoManager::getTargetRegionEntryFnName(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), OffloadInfoManager, llvm::OpenMPIRBuilderConfig::openMPOffloadMandatory(), registerTargetRegionFunction(), llvm::Error::success(), and llvm::Expected< T >::takeError().

Referenced by emitTargetOutlinedFunction().

◆ emitTargetTask()

OpenMPIRBuilder::InsertPointOrErrorTy OpenMPIRBuilder::emitTargetTask ( TargetTaskBodyCallbackTy  TaskBodyCB,
Value DeviceID,
Value RTLoc,
OpenMPIRBuilder::InsertPointTy  AllocaIP,
const SmallVector< llvm::OpenMPIRBuilder::DependData > &  Dependencies,
bool  HasNoWait 
)

Generate a target-task for the target construct.

Parameters
TaskBodyCBCallback to generate the actual body of the target task.
DeviceIDIdentifier for the device via the 'device' clause.
RTLocSource location identifier
AllocaIPThe insertion point to be used for alloca instructions.
DependenciesVector of DependData objects holding information of dependencies as specified by the 'depend' clause.
HasNoWaitTrue if the target construct had 'nowait' on it, false otherwise

Definition at line 7029 of file OMPIRBuilder.cpp.

References addOutlineInfo(), llvm::CallBase::arg_size(), assert(), llvm::BasicBlock::begin(), Builder, llvm::IRBuilderBase::CreateCall(), createFakeIntVal(), llvm::IRBuilderBase::CreateLoad(), llvm::IRBuilderBase::CreateMemCpy(), llvm::dbgs(), emitTargetTaskProxyFunction(), emitTaskDependencies(), llvm::OpenMPIRBuilder::OutlineInfo::EntryBB, llvm::Instruction::eraseFromParent(), llvm::OpenMPIRBuilder::OutlineInfo::ExcludeArgsFromAggregate, llvm::OpenMPIRBuilder::OutlineInfo::ExitBB, llvm::omp::Flags, llvm::ConstantPointerNull::get(), llvm::CallBase::getArgOperand(), llvm::IRBuilderBase::InsertPoint::getBlock(), llvm::Module::getContext(), llvm::Module::getDataLayout(), llvm::Instruction::getDebugLoc(), llvm::IRBuilderBase::GetInsertBlock(), llvm::IRBuilderBase::getInt32(), llvm::IRBuilderBase::getInt32Ty(), llvm::IRBuilderBase::getInt64(), llvm::Value::getNumUses(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), llvm::GlobalValue::getParent(), llvm::BasicBlock::getParent(), llvm::Value::getPointerAlignment(), llvm::DataLayout::getTypeStoreSize(), llvm::PointerType::getUnqual(), I, LLVM_DEBUG, M, llvm::OpenMPIRBuilder::OutlineInfo::OuterAllocaBB, llvm::OpenMPIRBuilder::OutlineInfo::PostOutlineCB, llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::IRBuilderBase::restoreIP(), llvm::reverse(), llvm::IRBuilderBase::saveIP(), llvm::Instruction::setDebugLoc(), llvm::IRBuilderBase::SetInsertPoint(), llvm::SmallVectorBase< Size_T >::size(), llvm::splitBB(), and llvm::Value::user_back().

Referenced by createTargetData().

◆ emitTaskwaitImpl()

void OpenMPIRBuilder::emitTaskwaitImpl ( const LocationDescription Loc)

Generate a taskwait runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1713 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), and getOrCreateThreadID().

Referenced by createTaskwait().

◆ emitTaskyieldImpl()

void OpenMPIRBuilder::emitTaskyieldImpl ( const LocationDescription Loc)

Generate a taskyield runtime call.

Parameters
LocThe location at which the request originated and is fulfilled.

Definition at line 1732 of file OMPIRBuilder.cpp.

References Builder, llvm::IRBuilderBase::CreateCall(), llvm::Constant::getNullValue(), getOrCreateIdent(), getOrCreateRuntimeFunctionPtr(), getOrCreateSrcLocStr(), getOrCreateThreadID(), and Int32.

Referenced by createTaskyield().

◆ emitUserDefinedMapper()

Function * OpenMPIRBuilder::emitUserDefinedMapper ( function_ref< MapInfosTy &(InsertPointTy CodeGenIP, llvm::Value *PtrPHI, llvm::Value *BeginArg)>  PrivAndGenMapInfoCB,
llvm::Type ElemTy,
StringRef  FuncName,
function_ref< bool(unsigned int, Function **)>  CustomMapperCB = nullptr 
)

Emit the user-defined mapper function.

The code generation follows the pattern in the example below.

void .omp_mapper.<type_name>.<mapper_id>.(void *rt_mapper_handle,
void *base, void *begin,
int64_t size, int64_t type,
void *name = nullptr) {
// Allocate space for an array section first or add a base/begin for
// pointer dereference.
if ((size > 1 || (base != begin && maptype.IsPtrAndObj)) &&
!maptype.IsDelete)
__tgt_push_mapper_component(rt_mapper_handle, base, begin,
size*sizeof(Ty), clearToFromMember(type));
// Map members.
for (unsigned i = 0; i < size; i++) {
// For each component specified by this mapper:
for (auto c : begin[i]->all_components) {
if (c.hasMapper())
(*c.Mapper())(rt_mapper_handle, c.arg_base, c.arg_begin,
c.arg_size,
c.arg_type, c.arg_name);
else
__tgt_push_mapper_component(rt_mapper_handle, c.arg_base,
c.arg_begin, c.arg_size, c.arg_type,
c.arg_name);
}
}
// Delete the array section.
if (size > 1 && maptype.IsDelete)
__tgt_push_mapper_component(rt_mapper_handle, base, begin,
size*sizeof(Ty), clearToFromMember(type));
}
static const char * name
Definition: SMEABIPass.cpp:46
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition: STLExtras.h:1697
Parameters
PrivAndGenMapInfoCBCallback that privatizes code and populates the MapInfos and returns.
ElemTyDeclareMapper element type.
FuncNameOptional param to specify mapper function name.
CustomMapperCBOptional callback to generate code related to custom mappers.

Definition at line 7794 of file OMPIRBuilder.cpp.

References llvm::Function::addFnAttr(), llvm::PHINode::addIncoming(), llvm::Function::addParamAttr(), Builder, llvm::Function::Create(), llvm::BasicBlock::Create(), llvm::IRBuilderBase::CreateAnd(), llvm::IRBuilderBase::CreateBitCast(), llvm::IRBuilderBase::CreateBr(), llvm::IRBuilderBase::CreateCall(), llvm::IRBuilderBase::CreateCondBr(), llvm::IRBuilderBase::CreateConstGEP1_32(), llvm::IRBuilderBase::CreateExactUDiv(), llvm::IRBuilderBase::CreateGEP(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateIsNull(), llvm::IRBuilderBase::CreateNUWAdd(), llvm::IRBuilderBase::CreatePHI(), llvm::IRBuilderBase::CreateRetVoid(), llvm::IRBuilderBase::CreateShl(), emitBlock(), llvm::SmallVectorImpl< T >::emplace_back(), llvm::FunctionType::get(), llvm::Function::getArg(), llvm::Module::getContext(), llvm::Module::getDataLayout(), getFlagMemberOffset(), llvm::IRBuilderBase::getInt64(), llvm::IRBuilderBase::getInt64Ty(), llvm::Constant::getNullValue(), getOrCreateRuntimeFunction(), llvm::IRBuilderBase::getPtrTy(), llvm::Value::getType(), llvm::DataLayout::getTypeStoreSize(), llvm::IRBuilderBase::getVoidTy(), I, Info, llvm::GlobalValue::InternalLinkage, M, llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::saveIP(), llvm::CallBase::setDoesNotThrow(), llvm::IRBuilderBase::SetInsertPoint(), and Size.

◆ finalize()

void OpenMPIRBuilder::finalize ( Function Fn = nullptr)

◆ getAddrOfDeclareTargetVar()

Constant * OpenMPIRBuilder::getAddrOfDeclareTargetVar ( OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind  CaptureClause,
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind  DeviceClause,
bool  IsDeclaration,
bool  IsExternallyVisible,
TargetRegionEntryInfo  EntryInfo,
StringRef  MangledName,
std::vector< GlobalVariable * > &  GeneratedRefs,
bool  OpenMPSIMD,
std::vector< Triple TargetTriple,
Type LlvmPtrTy,
std::function< Constant *()>  GlobalInitializer,
std::function< GlobalValue::LinkageTypes()>  VariableLinkage 
)

Retrieve (or create if non-existent) the address of a declare target variable, used in conjunction with registerTargetGlobalVariable to create declare target global variables.

Parameters
CaptureClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter).
DeviceClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any)
IsDeclaration- boolean stating if the variable being registered is a declaration-only and not a definition
IsExternallyVisible- boolean stating if the variable is externally visible
EntryInfo- Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable
MangledName- the mangled name of the variable being registered
GeneratedRefs- references generated by invocations of registerTargetGlobalVariable invoked from getAddrOfDeclareTargetVar, these are required by Clang for book keeping.
OpenMPSIMD- if OpenMP SIMD mode is currently enabled
TargetTriple- The OpenMP device target triple we are compiling for
LlvmPtrTy- The type of the variable we are generating or retrieving an address for
GlobalInitializer- a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference.
VariableLinkage- a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule.

Definition at line 9332 of file OMPIRBuilder.cpp.

References Config, llvm::TargetRegionEntryInfo::FileID, llvm::format(), llvm::Module::getNamedValue(), getOrCreateInternalVariable(), llvm::OpenMPIRBuilderConfig::hasRequiresUnifiedSharedMemory(), llvm::OpenMPIRBuilderConfig::isTargetDevice(), M, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, OS, Ptr, registerTargetGlobalVariable(), and llvm::GlobalValue::WeakAnyLinkage.

Referenced by registerTargetGlobalVariable().

◆ getFlagMemberOffset()

unsigned OpenMPIRBuilder::getFlagMemberOffset ( )

Get the offset of the OMP_MAP_MEMBER_OF field.

Definition at line 9296 of file OMPIRBuilder.cpp.

References llvm::Offset, and llvm::omp::OMP_MAP_MEMBER_OF.

Referenced by emitUserDefinedMapper(), and getMemberOfFlag().

◆ getInsertionPoint()

InsertPointTy llvm::OpenMPIRBuilder::getInsertionPoint ( )
inline

}

Return the insertion point used by the underlying IRBuilder.

Definition at line 1970 of file OMPIRBuilder.h.

References Builder, and llvm::IRBuilderBase::saveIP().

◆ getKernelArgsVector()

void OpenMPIRBuilder::getKernelArgsVector ( TargetKernelArgs KernelArgs,
IRBuilderBase Builder,
SmallVector< Value * > &  ArgsVector 
)
static

◆ getMemberOfFlag()

omp::OpenMPOffloadMappingFlags OpenMPIRBuilder::getMemberOfFlag ( unsigned  Position)

Get OMP_MAP_MEMBER_OF flag with extra bits reserved based on the position given.

Parameters
Position- A value indicating the position of the parent of the member in the kernel argument structure, often retrieved by the parents position in the combined information vectors used to generate the structure itself. Multiple children (member's of) with the same parent will use the same returned member flag.

Definition at line 9307 of file OMPIRBuilder.cpp.

References getFlagMemberOffset().

◆ getOpenMPDefaultSimdAlign()

unsigned OpenMPIRBuilder::getOpenMPDefaultSimdAlign ( const Triple TargetTriple,
const StringMap< bool > &  Features 
)
static

Get the default alignment value for given target.

Parameters
TargetTripleTarget triple
FeaturesStringMap which describes extra CPU features

Definition at line 5266 of file OMPIRBuilder.cpp.

References llvm::Triple::isPPC(), llvm::Triple::isWasm(), llvm::Triple::isX86(), and llvm::StringMap< ValueTy, AllocatorTy >::lookup().

◆ getOrCreateDefaultSrcLocStr()

Constant * OpenMPIRBuilder::getOrCreateDefaultSrcLocStr ( uint32_t SrcLocStrSize)

Return the (LLVM-IR) string describing the default source location.

Definition at line 913 of file OMPIRBuilder.cpp.

References getOrCreateSrcLocStr().

Referenced by getOrCreateSrcLocStr().

◆ getOrCreateIdent()

Constant * OpenMPIRBuilder::getOrCreateIdent ( Constant SrcLocStr,
uint32_t  SrcLocStrSize,
omp::IdentFlag  Flags = omp::IdentFlag(0),
unsigned  Reserve2Flags = 0 
)

◆ getOrCreateInternalVariable()

GlobalVariable * OpenMPIRBuilder::getOrCreateInternalVariable ( Type Ty,
const StringRef Name,
unsigned  AddressSpace = 0 
)

Gets (if variable with the given name already exist) or creates internal global variable with the specified Name.

The created variable has linkage CommonLinkage by default and is initialized by null value.

Parameters
TyType of the global variable. If it is exist already the type must be the same.
NameName of the variable.

Definition at line 7493 of file OMPIRBuilder.cpp.

References assert(), llvm::GlobalValue::CommonLinkage, DL, llvm::Module::getDataLayout(), llvm::Constant::getNullValue(), llvm::Module::getTargetTriple(), llvm::GlobalValue::InternalLinkage, InternalVars, M, Name, and llvm::GlobalValue::NotThreadLocal.

Referenced by createCachedThreadPrivate(), getAddrOfDeclareTargetVar(), and registerTargetGlobalVariable().

◆ getOrCreateRuntimeFunction()

FunctionCallee OpenMPIRBuilder::getOrCreateRuntimeFunction ( Module M,
omp::RuntimeFunction  FnID 
)

◆ getOrCreateRuntimeFunctionPtr()

Function * OpenMPIRBuilder::getOrCreateRuntimeFunctionPtr ( omp::RuntimeFunction  FnID)

◆ getOrCreateSrcLocStr() [1/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( const LocationDescription Loc,
uint32_t SrcLocStrSize 
)

◆ getOrCreateSrcLocStr() [2/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( DebugLoc  DL,
uint32_t SrcLocStrSize,
Function F = nullptr 
)

Return the (LLVM-IR) string describing the DebugLoc DL.

Use F as fallback if DL does not specify the function name.

Definition at line 918 of file OMPIRBuilder.cpp.

References DL, llvm::Function::empty(), F, llvm::Module::getName(), getOrCreateDefaultSrcLocStr(), getOrCreateSrcLocStr(), and M.

◆ getOrCreateSrcLocStr() [3/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( StringRef  FunctionName,
StringRef  FileName,
unsigned  Line,
unsigned  Column,
uint32_t SrcLocStrSize 
)

Return the (LLVM-IR) string describing the source location identified by the arguments.

Definition at line 894 of file OMPIRBuilder.cpp.

References llvm::SmallString< InternalLen >::append(), getOrCreateSrcLocStr(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and llvm::SmallString< InternalLen >::str().

◆ getOrCreateSrcLocStr() [4/4]

Constant * OpenMPIRBuilder::getOrCreateSrcLocStr ( StringRef  LocStr,
uint32_t SrcLocStrSize 
)

◆ getOrCreateThreadID()

Value * OpenMPIRBuilder::getOrCreateThreadID ( Value Ident)

◆ getSizeInBytes()

Value * OpenMPIRBuilder::getSizeInBytes ( Value BasePtr)

◆ getTargetEntryUniqueInfo()

TargetRegionEntryInfo OpenMPIRBuilder::getTargetEntryUniqueInfo ( FileIdentifierInfoCallbackTy  CallBack,
StringRef  ParentName = "" 
)
static

Creates a unique info for a target entry when provided a filename and line number from.

Parameters
CallBackA callback function which should return filename the entry resides in as well as the line number for the target entry
ParentNameThe name of the parent the target entry resides in, if any.

Definition at line 9281 of file OMPIRBuilder.cpp.

References llvm::sys::fs::getUniqueID(), and llvm::report_fatal_error().

◆ initialize()

void OpenMPIRBuilder::initialize ( )

Initialize the internal state, this will put structures types and potentially other helpers into the underlying module.

Must be called before any other method and only once! This internal state includes types used in the OpenMPIRBuilder generated from OMPKinds.def.

Definition at line 649 of file OMPIRBuilder.cpp.

References M.

◆ isLastFinalizationInfoCancellable()

bool llvm::OpenMPIRBuilder::isLastFinalizationInfoCancellable ( omp::Directive  DK)
inline

Return true if the last entry in the finalization stack is of kind DK and cancellable.

Definition at line 2056 of file OMPIRBuilder.h.

References FinalizationStack.

Referenced by createBarrier(), and emitCancelationCheckImpl().

◆ loadOffloadInfoMetadata() [1/2]

void OpenMPIRBuilder::loadOffloadInfoMetadata ( Module M)

◆ loadOffloadInfoMetadata() [2/2]

void OpenMPIRBuilder::loadOffloadInfoMetadata ( StringRef  HostFilePath)

Loads all the offload entries information from the host IR metadata read from the file passed in as the HostFilePath argument.

This function is only meant to be used with device code generation.

Parameters
HostFilePathThe path to the host IR file, used to load in offload metadata for the device, allowing host and device to maintain the same metadata mapping.

Definition at line 9515 of file OMPIRBuilder.cpp.

References llvm::StringRef::empty(), llvm::expectedToErrorOrAndEmitErrors(), llvm::MemoryBuffer::getFile(), loadOffloadInfoMetadata(), M, llvm::parseBitcodeFile(), and llvm::report_fatal_error().

◆ popFinalizationCB()

void llvm::OpenMPIRBuilder::popFinalizationCB ( )
inline

Pop the last finalization callback from the finalization stack.

NOTE: Temporary solution until Clang CG is gone.

Definition at line 568 of file OMPIRBuilder.h.

References FinalizationStack.

◆ pushFinalizationCB()

void llvm::OpenMPIRBuilder::pushFinalizationCB ( const FinalizationInfo FI)
inline

Push a finalization callback on the finalization stack.

NOTE: Temporary solution until Clang CG is gone.

Definition at line 561 of file OMPIRBuilder.h.

References FinalizationStack.

◆ readTeamBoundsForKernel()

std::pair< int32_t, int32_t > OpenMPIRBuilder::readTeamBoundsForKernel ( const Triple T,
Function Kernel 
)
static

Read/write a bounds on teams for Kernel.

Read will return 0 if none is set.

Definition at line 6381 of file OMPIRBuilder.cpp.

References llvm::Function::getFnAttributeAsParsedInteger().

◆ readThreadBoundsForKernel()

std::pair< int32_t, int32_t > OpenMPIRBuilder::readThreadBoundsForKernel ( const Triple T,
Function Kernel 
)
static

}

Helpers to read/write kernel annotations from the IR.

{ Read/write a bounds on threads for Kernel. Read will return 0 if none is set.

Definition at line 6340 of file OMPIRBuilder.cpp.

References llvm::Function::getFnAttribute(), llvm::Function::getFnAttributeAsParsedInteger(), getNVPTXMDNode(), llvm::Attribute::getValueAsString(), and llvm::StringRef::split().

◆ registerTargetGlobalVariable()

void OpenMPIRBuilder::registerTargetGlobalVariable ( OffloadEntriesInfoManager::OMPTargetGlobalVarEntryKind  CaptureClause,
OffloadEntriesInfoManager::OMPTargetDeviceClauseKind  DeviceClause,
bool  IsDeclaration,
bool  IsExternallyVisible,
TargetRegionEntryInfo  EntryInfo,
StringRef  MangledName,
std::vector< GlobalVariable * > &  GeneratedRefs,
bool  OpenMPSIMD,
std::vector< Triple TargetTriple,
std::function< Constant *()>  GlobalInitializer,
std::function< GlobalValue::LinkageTypes()>  VariableLinkage,
Type LlvmPtrTy,
Constant Addr 
)

Registers a target variable for device or host.

Parameters
CaptureClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (link, to, enter).
DeviceClause- enumerator corresponding to the OpenMP capture clause used in conjunction with the variable being registered (nohost, host, any)
IsDeclaration- boolean stating if the variable being registered is a declaration-only and not a definition
IsExternallyVisible- boolean stating if the variable is externally visible
EntryInfo- Unique entry information for the value generated using getTargetEntryUniqueInfo, used to name generated pointer references to the declare target variable
MangledName- the mangled name of the variable being registered
GeneratedRefs- references generated by invocations of registerTargetGlobalVariable these are required by Clang for book keeping.
OpenMPSIMD- if OpenMP SIMD mode is currently enabled
TargetTriple- The OpenMP device target triple we are compiling for
GlobalInitializer- a lambda function which creates a constant used for initializing a pointer reference to the variable in certain cases. If a nullptr is passed, it will default to utilising the original variable to initialize the pointer reference.
VariableLinkage- a lambda function which returns the variables linkage type, if unspecified and a nullptr is given, it will instead utilise the linkage stored on the existing global variable in the LLVMModule.
LlvmPtrTy- The type of the variable we are generating or retrieving an address for
Addr- the original llvm value (addr) of the variable to be registered

Definition at line 9388 of file OMPIRBuilder.cpp.

References Addr, Config, createPlatformSpecificName(), llvm::divideCeil(), llvm::omp::Flags, getAddrOfDeclareTargetVar(), llvm::Module::getDataLayout(), llvm::GlobalValue::getLinkage(), llvm::Module::getNamedValue(), getOrCreateInternalVariable(), llvm::DataLayout::getPointerSize(), llvm::DataLayout::getTypeSizeInBits(), llvm::GlobalValue::getValueType(), llvm::OffloadEntriesInfoManager::hasDeviceGlobalVarEntryInfo(), llvm::OpenMPIRBuilderConfig::hasRequiresUnifiedSharedMemory(), llvm::GlobalValue::InternalLinkage, llvm::OpenMPIRBuilderConfig::isTargetDevice(), llvm::GlobalValue::LinkOnceODRLinkage, M, OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetDeviceClauseAny, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryEnter, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryLink, llvm::OffloadEntriesInfoManager::OMPTargetGlobalVarEntryTo, llvm::OffloadEntriesInfoManager::registerDeviceGlobalVarEntryInfo(), and llvm::GlobalValue::WeakAnyLinkage.

Referenced by getAddrOfDeclareTargetVar().

◆ registerTargetRegionFunction()

Constant * OpenMPIRBuilder::registerTargetRegionFunction ( TargetRegionEntryInfo EntryInfo,
Function OutlinedFunction,
StringRef  EntryFnName,
StringRef  EntryFnIDName 
)

Registers the given function and sets up the attribtues of the function Returns the FunctionID.

Parameters
InfoManagerThe info manager keeping track of the offload entries
EntryInfoThe entry information about the function
OutlinedFunctionPointer to the outlined function
EntryFnNameName of the outlined function
EntryFnIDNameName of the ID o be created

Definition at line 6466 of file OMPIRBuilder.cpp.

References OffloadInfoManager, llvm::OffloadEntriesInfoManager::OMPTargetRegionEntryTargetRegion, and llvm::OffloadEntriesInfoManager::registerTargetRegionEntryInfo().

Referenced by emitTargetRegionFunction().

◆ setConfig()

void llvm::OpenMPIRBuilder::setConfig ( OpenMPIRBuilderConfig  C)
inline

Definition at line 509 of file OMPIRBuilder.h.

References llvm::CallingConv::C, and Config.

◆ setCorrectMemberOfFlag()

void OpenMPIRBuilder::setCorrectMemberOfFlag ( omp::OpenMPOffloadMappingFlags Flags,
omp::OpenMPOffloadMappingFlags  MemberOfFlag 
)

Given an initial flag set, this function modifies it to contain the passed in MemberOfFlag generated from the getMemberOfFlag function.

The results are dependent on the existing flag bits set in the original flag set.

Parameters
Flags- The original set of flags to be modified with the passed in MemberOfFlag.
MemberOfFlag- A modified OMP_MAP_MEMBER_OF flag, adjusted slightly based on the getMemberOfFlag which adjusts the flag bits based on the members position in its parent.

Definition at line 9313 of file OMPIRBuilder.cpp.

References llvm::omp::Flags, llvm::omp::OMP_MAP_MEMBER_OF, and llvm::omp::OMP_MAP_PTR_AND_OBJ.

◆ tileLoops()

std::vector< CanonicalLoopInfo * > OpenMPIRBuilder::tileLoops ( DebugLoc  DL,
ArrayRef< CanonicalLoopInfo * >  Loops,
ArrayRef< Value * >  TileSizes 
)

Tile a loop nest.

Tiles the loops of Loops by the tile sizes in TileSizes. Loops in / Loops must be perfectly nested, from outermost to innermost loop (i.e. Loops.front() is the outermost loop). The trip count llvm::Value of every loop and every tile sizes must be usable in the outermost loop's preheader. This implies that the loop nest is rectangular.

Example:

for (int i = 0; i < 15; ++i) // Canonical loop "i"
for (int j = 0; j < 14; ++j) // Canonical loop "j"
body(i, j);

After tiling with Loops={i,j} and TileSizes={5,7}, the loop is changed to

for (int i1 = 0; i1 < 3; ++i1)
for (int j1 = 0; j1 < 2; ++j1)
for (int i2 = 0; i2 < 5; ++i2)
for (int j2 = 0; j2 < 7; ++j2)
body(i1*3+i2, j1*3+j2);

The returned vector are the loops {i1,j1,i2,j2}. The loops i1 and j1 are referred to the floor, and the loops i2 and j2 are the tiles. Tiling also handles non-constant trip counts, non-constant tile sizes and trip counts that are not multiples of the tile size. In the latter case the tile loop of the last floor-loop iteration will have fewer iterations than specified as its tile size.

Parameters
DLDebug location for instructions added by tiling, for instance the floor- and tile trip count computation.
LoopsLoops to tile. The CanonicalLoopInfo objects are invalidated by this method, i.e. should not used after tiling.
TileSizesFor each loop in Loops, the tile size for that dimensions.
Returns
A list of generated loops. Contains twice as many loops as the input loop nest; the first half are the floor loops and the second half are the tile loops.

Definition at line 4950 of file OMPIRBuilder.cpp.

References assert(), Builder, llvm::Continue, llvm::IRBuilderBase::CreateAdd(), llvm::IRBuilderBase::CreateICmpEQ(), llvm::IRBuilderBase::CreateICmpNE(), createLoopSkeleton(), llvm::IRBuilderBase::CreateMul(), llvm::IRBuilderBase::CreateSelect(), llvm::IRBuilderBase::CreateUDiv(), llvm::IRBuilderBase::CreateURem(), llvm::IRBuilderBase::CreateZExt(), DL, llvm::SmallVectorImpl< T >::emplace_back(), llvm::enumerate(), F, llvm::CanonicalLoopInfo::getAfter(), llvm::CanonicalLoopInfo::getBody(), llvm::CanonicalLoopInfo::getExit(), llvm::CanonicalLoopInfo::getIndVar(), llvm::CanonicalLoopInfo::getLatch(), llvm::BasicBlock::getParent(), llvm::CanonicalLoopInfo::getPreheader(), llvm::CanonicalLoopInfo::getPreheaderIP(), llvm::BasicBlock::getTerminator(), llvm::Value::getType(), Loops, Name, P, llvm::SmallVectorTemplateBase< T, bool >::push_back(), redirectAllPredecessorsTo(), redirectTo(), removeUnusedBlocksFromParent(), llvm::Value::replaceAllUsesWith(), llvm::SmallVectorImpl< T >::reserve(), llvm::IRBuilderBase::restoreIP(), llvm::IRBuilderBase::SetCurrentDebugLocation(), llvm::IRBuilderBase::SetInsertPoint(), llvm::ArrayRef< T >::size(), Size, and TileSize.

Referenced by unrollLoopPartial().

◆ unrollLoopFull()

void OpenMPIRBuilder::unrollLoopFull ( DebugLoc  DL,
CanonicalLoopInfo Loop 
)

Fully unroll a loop.

Instead of unrolling the loop immediately (and duplicating its body instructions), it is deferred to LLVM's LoopUnrollPass by adding loop metadata.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.

Definition at line 5189 of file OMPIRBuilder.cpp.

References addLoopMetadata(), Builder, llvm::MDNode::get(), llvm::MDString::get(), and llvm::IRBuilderBase::getContext().

◆ unrollLoopHeuristic()

void OpenMPIRBuilder::unrollLoopHeuristic ( DebugLoc  DL,
CanonicalLoopInfo Loop 
)

Fully or partially unroll a loop.

How the loop is unrolled is determined using LLVM's LoopUnrollPass.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.

Definition at line 5196 of file OMPIRBuilder.cpp.

References addLoopMetadata(), Builder, llvm::MDNode::get(), llvm::MDString::get(), and llvm::IRBuilderBase::getContext().

◆ unrollLoopPartial()

void OpenMPIRBuilder::unrollLoopPartial ( DebugLoc  DL,
CanonicalLoopInfo Loop,
int32_t  Factor,
CanonicalLoopInfo **  UnrolledCLI 
)

Partially unroll a loop.

The CanonicalLoopInfo of the unrolled loop for use with chained loop-associated directive can be requested using UnrolledCLI. Not needing the CanonicalLoopInfo allows more efficient code generation by deferring the actual unrolling to the LoopUnrollPass using loop metadata. A loop-associated directive applied to the unrolled loop needs to know the new trip count which means that if using a heuristically determined unroll factor (Factor == 0), that factor must be computed immediately. We are using the same logic as the LoopUnrollPass to derived the unroll factor, but which assumes that some canonicalization has taken place (e.g. Mem2Reg, LICM, GVN, Inlining, etc.). That is, the heuristic will perform better when the unrolled loop's CanonicalLoopInfo is not needed.

Parameters
DLDebug location for instructions added by unrolling.
LoopThe loop to unroll. The loop will be invalidated.
FactorThe factor to unroll the loop by. A factor of 0 indicates that a heuristic should be used to determine the unroll-factor.
UnrolledCLIIf non-null, receives the CanonicalLoopInfo of the partially unrolled loop. Otherwise, uses loop metadata to defer unrolling to the LoopUnrollPass.

Definition at line 5551 of file OMPIRBuilder.cpp.

References addLoopMetadata(), assert(), computeHeuristicUnrollFactor(), DL, F, llvm::ConstantAsMetadata::get(), llvm::MDNode::get(), llvm::MDString::get(), llvm::Type::getInt32Ty(), llvm::Type::getIntegerBitWidth(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), and tileLoops().

◆ updateToLocation()

bool llvm::OpenMPIRBuilder::updateToLocation ( const LocationDescription Loc)
inline

◆ writeTeamsForKernel()

void OpenMPIRBuilder::writeTeamsForKernel ( const Triple T,
Function Kernel,
int32_t  LB,
int32_t  UB 
)
static

Definition at line 6386 of file OMPIRBuilder.cpp.

References llvm::Function::addFnAttr(), and updateNVPTXMetadata().

Referenced by createTargetInit().

◆ writeThreadBoundsForKernel()

void OpenMPIRBuilder::writeThreadBoundsForKernel ( const Triple T,
Function Kernel,
int32_t  LB,
int32_t  UB 
)
static

Definition at line 6366 of file OMPIRBuilder.cpp.

References llvm::Function::addFnAttr(), and updateNVPTXMetadata().

Referenced by createTargetInit().

Member Data Documentation

◆ Builder

IRBuilder llvm::OpenMPIRBuilder::Builder

The LLVM-IR Builder used to create IR.

Definition at line 2084 of file OMPIRBuilder.h.

Referenced by applySimd(), collapseLoops(), createAtomicCapture(), createAtomicCompare(), createAtomicRead(), createAtomicUpdate(), createAtomicWrite(), createBarrier(), createCachedThreadPrivate(), createCancel(), createCanonicalLoop(), createCopyinClauseBlocks(), createCopyPrivate(), createCritical(), createLoopSkeleton(), createMapperAllocas(), createMasked(), createMaster(), createOffloadEntriesAndInfoMetadata(), createOMPAlloc(), createOMPFree(), createOMPInteropDestroy(), createOMPInteropInit(), createOMPInteropUse(), createOrderedDepend(), createOrderedThreadsSimd(), createParallel(), createReductions(), createReductionsGPU(), createSection(), createSections(), createSingle(), createTarget(), createTargetData(), createTargetDeinit(), createTargetInit(), createTargetLoopWorkshareCall(), createTask(), createTaskgroup(), createTeams(), emitBlock(), emitBranch(), emitCancelationCheckImpl(), emitFlush(), emitIfClause(), emitKernelLaunch(), emitMapperCall(), emitNonContiguousDescriptor(), emitOffloadingArrays(), emitOffloadingArraysAndArgs(), emitOffloadingArraysArgument(), emitTargetKernel(), emitTargetTask(), emitTaskDependencies(), emitTaskwaitImpl(), emitTaskyieldImpl(), emitUserDefinedMapper(), finalize(), getInsertionPoint(), getKernelArgsVector(), getOrCreateSrcLocStr(), getOrCreateThreadID(), getSizeInBytes(), hostParallelCallback(), targetParallelCallback(), tileLoops(), unrollLoopFull(), unrollLoopHeuristic(), updateToLocation(), and workshareLoopTargetCallback().

◆ Config

OpenMPIRBuilderConfig llvm::OpenMPIRBuilder::Config

◆ ConstantAllocaRaiseCandidates

SmallVector<llvm::Function *, 16> llvm::OpenMPIRBuilder::ConstantAllocaRaiseCandidates

A collection of candidate target functions that's constant allocas will attempt to be raised on a call of finalize after all currently enqueued outline info's have been processed.

Definition at line 2121 of file OMPIRBuilder.h.

Referenced by createOutlinedFunction(), and finalize().

◆ FinalizationStack

SmallVector<FinalizationInfo, 8> llvm::OpenMPIRBuilder::FinalizationStack

The finalization stack made up of finalize callbacks currently in-flight, wrapped into FinalizationInfo objects that reference also the finalization target block and the kind of cancellable directive.

Definition at line 2052 of file OMPIRBuilder.h.

Referenced by createParallel(), createSections(), emitCancelationCheckImpl(), isLastFinalizationInfoCancellable(), popFinalizationCB(), and pushFinalizationCB().

◆ IdentMap

DenseMap<std::pair<Constant *, uint64_t>, Constant *> llvm::OpenMPIRBuilder::IdentMap

Map to remember existing ident_t*.

Definition at line 2090 of file OMPIRBuilder.h.

Referenced by getOrCreateIdent().

◆ InternalVars

StringMap<GlobalVariable *, BumpPtrAllocator> llvm::OpenMPIRBuilder::InternalVars

An ordered map of auto-generated variables to their unique names.

It stores variables with the following names: 1) ".gomp_critical_user_" + <critical_section_name> + ".var" for "omp critical" directives; 2) <mangled_name_for_global_var> + ".cache." for cache for threadprivate variables.

Definition at line 2135 of file OMPIRBuilder.h.

Referenced by getOrCreateInternalVariable().

◆ LoopInfos

std::forward_list<CanonicalLoopInfo> llvm::OpenMPIRBuilder::LoopInfos

Collection of owned canonical loop objects that eventually need to be free'd.

Definition at line 2125 of file OMPIRBuilder.h.

Referenced by createLoopSkeleton().

◆ M

Module& llvm::OpenMPIRBuilder::M

◆ OffloadInfoManager

OffloadEntriesInfoManager llvm::OpenMPIRBuilder::OffloadInfoManager

◆ ompOffloadInfoName

const std::string llvm::OpenMPIRBuilder::ompOffloadInfoName = "omp_offload.info"

OMP Offload Info Metadata name string.

Definition at line 3342 of file OMPIRBuilder.h.

Referenced by loadOffloadInfoMetadata().

◆ OutlineInfos

SmallVector<OutlineInfo, 16> llvm::OpenMPIRBuilder::OutlineInfos

Collection of regions that need to be outlined during finalization.

Definition at line 2116 of file OMPIRBuilder.h.

Referenced by addOutlineInfo(), finalize(), and ~OpenMPIRBuilder().

◆ SrcLocStrMap

StringMap<Constant *> llvm::OpenMPIRBuilder::SrcLocStrMap

Map to remember source location strings.

Definition at line 2087 of file OMPIRBuilder.h.

Referenced by getOrCreateSrcLocStr().

◆ T

const Triple llvm::OpenMPIRBuilder::T

The target triple of the underlying module.

Definition at line 2096 of file OMPIRBuilder.h.


The documentation for this class was generated from the following files: