LLVM 20.0.0git
LoopVectorizationCostModel - estimates the expected speedups due to vectorization. More...
Classes | |
struct | CallWideningDecision |
struct | RegisterUsage |
A struct that represents some properties of the register usage of a loop. More... | |
Public Types | |
enum | InstWidening { CM_Unknown , CM_Widen , CM_Widen_Reverse , CM_Interleave , CM_GatherScatter , CM_Scalarize , CM_VectorCall , CM_IntrinsicCall } |
Decision that was taken during cost calculation for memory instruction. More... | |
Public Member Functions | |
LoopVectorizationCostModel (ScalarEpilogueLowering SEL, Loop *L, PredicatedScalarEvolution &PSE, LoopInfo *LI, LoopVectorizationLegality *Legal, const TargetTransformInfo &TTI, const TargetLibraryInfo *TLI, DemandedBits *DB, AssumptionCache *AC, OptimizationRemarkEmitter *ORE, const Function *F, const LoopVectorizeHints *Hints, InterleavedAccessInfo &IAI) | |
FixedScalableVFPair | computeMaxVF (ElementCount UserVF, unsigned UserIC) |
bool | runtimeChecksRequired () |
bool | selectUserVectorizationFactor (ElementCount UserVF) |
Set up cost-based decisions for the user vectorization factor. | |
std::pair< unsigned, unsigned > | getSmallestAndWidestTypes () |
unsigned | selectInterleaveCount (ElementCount VF, InstructionCost LoopCost) |
void | setCostBasedWideningDecision (ElementCount VF) |
Memory access instruction may be vectorized in more than one way. | |
void | setVectorizedCallDecision (ElementCount VF) |
A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking. | |
SmallVector< RegisterUsage, 8 > | calculateRegisterUsage (ArrayRef< ElementCount > VFs) |
void | collectValuesToIgnore () |
Collect values we want to ignore in the cost model. | |
void | collectElementTypesForWidening () |
Collect all element types in the loop for which widening is needed. | |
void | collectInLoopReductions () |
Split reductions into those that happen in the loop, and those that happen outside. | |
bool | useOrderedReductions (const RecurrenceDescriptor &RdxDesc) const |
Returns true if we should use strict in-order reductions for the given RdxDesc. | |
const MapVector< Instruction *, uint64_t > & | getMinimalBitwidths () const |
bool | isProfitableToScalarize (Instruction *I, ElementCount VF) const |
bool | isUniformAfterVectorization (Instruction *I, ElementCount VF) const |
Returns true if I is known to be uniform after vectorization. | |
bool | isScalarAfterVectorization (Instruction *I, ElementCount VF) const |
Returns true if I is known to be scalar after vectorization. | |
bool | canTruncateToMinimalBitwidth (Instruction *I, ElementCount VF) const |
void | setWideningDecision (Instruction *I, ElementCount VF, InstWidening W, InstructionCost Cost) |
Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF . | |
void | setWideningDecision (const InterleaveGroup< Instruction > *Grp, ElementCount VF, InstWidening W, InstructionCost Cost) |
Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF . | |
InstWidening | getWideningDecision (Instruction *I, ElementCount VF) const |
Return the cost model decision for the given instruction I and vector width VF . | |
InstructionCost | getWideningCost (Instruction *I, ElementCount VF) |
Return the vectorization cost for the given instruction I and vector width VF . | |
void | setCallWideningDecision (CallInst *CI, ElementCount VF, InstWidening Kind, Function *Variant, Intrinsic::ID IID, std::optional< unsigned > MaskPos, InstructionCost Cost) |
CallWideningDecision | getCallWideningDecision (CallInst *CI, ElementCount VF) const |
bool | isOptimizableIVTruncate (Instruction *I, ElementCount VF) |
Returns true if instruction I is an optimizable truncate whose operand is an induction variable. | |
void | collectInstsToScalarize (ElementCount VF) |
Collects the instructions to scalarize for each predicated instruction in the loop. | |
void | collectUniformsAndScalars (ElementCount VF) |
Collect Uniform and Scalar values for the given VF . | |
bool | isLegalMaskedStore (Type *DataType, Value *Ptr, Align Alignment) const |
Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr . | |
bool | isLegalMaskedLoad (Type *DataType, Value *Ptr, Align Alignment) const |
Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr . | |
bool | isLegalGatherOrScatter (Value *V, ElementCount VF) |
Returns true if the target machine can represent V as a masked gather or scatter operation. | |
bool | canVectorizeReductions (ElementCount VF) const |
Returns true if the target machine supports all of the reduction variables found for the given VF. | |
bool | isDivRemScalarWithPredication (InstructionCost ScalarCost, InstructionCost SafeDivisorCost) const |
Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem. | |
bool | isScalarWithPredication (Instruction *I, ElementCount VF) const |
Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization. | |
bool | isPredicatedInst (Instruction *I) const |
Returns true if I is an instruction that needs to be predicated at runtime. | |
std::pair< InstructionCost, InstructionCost > | getDivRemSpeculationCost (Instruction *I, ElementCount VF) const |
Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane. | |
bool | memoryInstructionCanBeWidened (Instruction *I, ElementCount VF) |
Returns true if I is a memory instruction with consecutive memory access that can be widened. | |
bool | interleavedAccessCanBeWidened (Instruction *I, ElementCount VF) const |
Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles. | |
bool | isAccessInterleaved (Instruction *Instr) const |
Check if Instr belongs to any interleaved access group. | |
const InterleaveGroup< Instruction > * | getInterleavedAccessGroup (Instruction *Instr) const |
Get the interleaved access group that Instr belongs to. | |
bool | requiresScalarEpilogue (bool IsVectorizing) const |
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop. | |
bool | requiresScalarEpilogue (VFRange Range) const |
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop for all VFs in Range . | |
bool | isScalarEpilogueAllowed () const |
Returns true if a scalar epilogue is allowed, i.e. it is not disallowed due to optsize or a loop hint annotation. | |
TailFoldingStyle | getTailFoldingStyle (bool IVUpdateMayOverflow=true) const |
Returns the TailFoldingStyle that is best for the current loop. | |
void | setTailFoldingStyles (bool IsScalableVF, unsigned UserIC) |
Selects and saves the TailFoldingStyle for two cases: whether the IV update may overflow or not. | |
bool | foldTailByMasking () const |
Returns true if all loop blocks should be masked to fold tail loop. | |
std::optional< unsigned > | getMaxSafeElements () const |
Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies. | |
bool | blockNeedsPredicationForAnyReason (BasicBlock *BB) const |
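Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated. | |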
bool | foldTailWithEVL () const |
Returns true if VP intrinsics with explicit vector length support should be generated in the tail folded loop. | |
bool | isInLoopReduction (PHINode *Phi) const |
Returns true if the Phi is part of an inloop reduction. | |
bool | usePredicatedReductionSelect (unsigned Opcode, Type *PhiTy) const |
Returns true if the predicated reduction select should be used to set the incoming value for the reduction phi. | |
InstructionCost | getVectorIntrinsicCost (CallInst *CI, ElementCount VF) const |
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF. | |
InstructionCost | getVectorCallCost (CallInst *CI, ElementCount VF) const |
Estimate cost of a call instruction CI if it were vectorized with factor VF. | |
void | invalidateCostModelingDecisions () |
Invalidates decisions already taken by the cost model. | |
InstructionCost | expectedCost (ElementCount VF) |
Returns the expected execution cost. | |
bool | hasPredStores () const |
bool | isEpilogueVectorizationProfitable (const ElementCount VF, const unsigned IC) const |
Returns true if epilogue vectorization is considered profitable, and false otherwise. | |
InstructionCost | getInstructionCost (Instruction *I, ElementCount VF) |
Returns the execution time cost of an instruction for a given vector width. | |
std::optional< InstructionCost > | getReductionPatternCost (Instruction *I, ElementCount VF, Type *VectorTy, TTI::TargetCostKind CostKind) const |
Return the cost of instructions in an inloop reduction pattern, if I is part of that pattern. | |
bool | shouldConsiderInvariant (Value *Op) |
Returns true if Op should be considered invariant and if it is trivially hoistable. | |
Public Attributes | |
Loop * | TheLoop |
The loop that we evaluate. | |
PredicatedScalarEvolution & | PSE |
Predicated scalar evolution analysis. | |
LoopInfo * | LI |
Loop Info analysis. | |
LoopVectorizationLegality * | Legal |
Vectorization legality. | |
const TargetTransformInfo & | TTI |
Vector target information. | |
const TargetLibraryInfo * | TLI |
Target Library Info. | |
DemandedBits * | DB |
Demanded bits analysis. | |
AssumptionCache * | AC |
Assumption cache. | |
OptimizationRemarkEmitter * | ORE |
Interface to emit optimization remarks. | |
const Function * | TheFunction |
const LoopVectorizeHints * | Hints |
Loop Vectorize Hint. | |
InterleavedAccessInfo & | InterleaveInfo |
The interleave access information contains groups of interleaved accesses with the same stride and close to each other. | |
SmallPtrSet< const Value *, 16 > | ValuesToIgnore |
Values to ignore in the cost model. | |
SmallPtrSet< const Value *, 16 > | VecValuesToIgnore |
Values to ignore in the cost model when VF > 1. | |
SmallPtrSet< Type *, 16 > | ElementTypesInLoop |
All element types found in the loop. | |
Friends | |
class | LoopVectorizationPlanner |
LoopVectorizationCostModel - estimates the expected speedups due to vectorization.
In many cases vectorization is not profitable, and this can happen for a number of reasons. This class mainly attempts to predict the expected speedup or slowdown due to the supported instruction set. It uses TargetTransformInfo to query the different backends for the cost of different operations.
Definition at line 988 of file LoopVectorize.cpp.
enum LoopVectorizationCostModel::InstWidening
Decision that was taken during cost calculation for memory instruction.
Enumerator | |
---|---|
CM_Unknown | |
CM_Widen | |
CM_Widen_Reverse | |
CM_Interleave | |
CM_GatherScatter | |
CM_Scalarize | |
CM_VectorCall | |
CM_IntrinsicCall |
Definition at line 1148 of file LoopVectorize.cpp.
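LoopVectorizationCostModel::LoopVectorizationCostModel | ( | ScalarEpilogueLowering | SEL, |
Loop * | L, | ||
PredicatedScalarEvolution & | PSE, | ||
LoopInfo * | LI, | ||
LoopVectorizationLegality * | Legal, | ||
const TargetTransformInfo & | TTI, | ||
const TargetLibraryInfo * | TLI, | ||
DemandedBits * | DB, | ||
AssumptionCache * | AC, | ||
OptimizationRemarkEmitter * | ORE, | ||
const Function * | F, | ||
const LoopVectorizeHints * | Hints, | ||
InterleavedAccessInfo & | IAI | ||
) | inline |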
Definition at line 992 of file LoopVectorize.cpp.
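bool LoopVectorizationCostModel::blockNeedsPredicationForAnyReason | ( | BasicBlock * | BB | ) | const | inline |
Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.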
Definition at line 1502 of file LoopVectorize.cpp.
References llvm::LoopVectorizationLegality::blockNeedsPredication(), foldTailByMasking(), and Legal.
Referenced by collectInstsToScalarize(), interleavedAccessCanBeWidened(), isPredicatedInst(), and llvm::LoopVectorizationPlanner::plan().
SmallVector< LoopVectorizationCostModel::RegisterUsage, 8 > LoopVectorizationCostModel::calculateRegisterUsage | ( | ArrayRef< ElementCount > | VFs | ) |
Definition at line 5305 of file LoopVectorize.cpp.
References llvm::all_of(), llvm::LoopBlocksDFS::beginRPO(), collectInLoopReductions(), collectUniformsAndScalars(), llvm::LoopBase< BlockT, LoopT >::contains(), llvm::SmallPtrSetImpl< PtrType >::count(), llvm::dbgs(), End, llvm::LoopBlocksDFS::endRPO(), llvm::SmallPtrSetImpl< PtrType >::erase(), llvm::VectorType::get(), llvm::ElementCount::getFixed(), llvm::TargetTransformInfo::getRegisterClassForType(), llvm::TargetTransformInfo::getRegisterClassName(), llvm::TargetTransformInfo::getRegUsageForType(), I, Idx, llvm::SetVector< T, Vector, Set, N >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), isScalarAfterVectorization(), llvm::Type::isTokenTy(), llvm::VectorType::isValidElementType(), llvm::ElementCount::isVector(), LI, llvm::List, LLVM_DEBUG, llvm::LoopVectorizationCostModel::RegisterUsage::LoopInvariantRegs, llvm::make_range(), llvm::LoopVectorizationCostModel::RegisterUsage::MaxLocalUsers, llvm::LoopBlocksDFS::perform(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), llvm::RegUsage, llvm::ArrayRef< T >::size(), llvm::MapVector< KeyT, ValueT, MapType, VectorType >::size(), llvm::SmallPtrSetImplBase::size(), llvm::SmallVectorBase< Size_T >::size(), TheLoop, ToRemove, TTI, ValuesToIgnore, and VecValuesToIgnore.
Referenced by selectInterleaveCount().
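bool LoopVectorizationCostModel::canTruncateToMinimalBitwidth | ( | Instruction * | I, |
ElementCount | VF | ||
) | const | inline |
Returns true if instruction I can be truncated to a smaller bitwidth for vectorization factor VF.
Definition at line 1141 of file LoopVectorize.cpp.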
References I, isProfitableToScalarize(), isScalarAfterVectorization(), and llvm::ElementCount::isVector().
Referenced by getInstructionCost().
bool LoopVectorizationCostModel::canVectorizeReductions | ( | ElementCount | VF | ) | const | inline |
Returns true if the target machine supports all of the reduction variables found for the given VF.
Definition at line 1321 of file LoopVectorize.cpp.
References llvm::all_of(), llvm::LoopVectorizationLegality::getReductionVars(), Legal, and Reduction.
void LoopVectorizationCostModel::collectElementTypesForWidening | ( | ) |
Collect all element types in the loop for which widening is needed.
Definition at line 4977 of file LoopVectorize.cpp.
References assert(), llvm::LoopBase< BlockT, LoopT >::blocks(), ElementTypesInLoop, llvm::MapVector< KeyT, ValueT, MapType, VectorType >::find(), llvm::RecurrenceDescriptor::getOpcode(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::LoopVectorizationLegality::getReductionVars(), I, llvm::LoopVectorizationLegality::isReductionVariable(), Legal, llvm::TargetTransformInfo::preferInLoopReduction(), PreferInLoopReductions, TheLoop, useOrderedReductions(), and ValuesToIgnore.
Referenced by llvm::LoopVectorizationPlanner::plan(), and processLoopInVPlanNativePath().
void LoopVectorizationCostModel::collectInLoopReductions | ( | ) |
Split reductions into those that happen in the loop, and those that happen outside.
In-loop reductions are collected into InLoopReductions.
Definition at line 7158 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::SmallVectorBase< Size_T >::empty(), llvm::RecurrenceDescriptor::getOpcode(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::RecurrenceDescriptor::getReductionOpChain(), llvm::LoopVectorizationLegality::getReductionVars(), I, llvm::SmallPtrSetImpl< PtrType >::insert(), Legal, LLVM_DEBUG, llvm::TargetTransformInfo::preferInLoopReduction(), PreferInLoopReductions, Reduction, TheLoop, and useOrderedReductions().
Referenced by calculateRegisterUsage(), and llvm::LoopVectorizationPlanner::plan().
void LoopVectorizationCostModel::collectInstsToScalarize | ( | ElementCount | VF | ) |
Collects the instructions to scalarize for each predicated instruction in the loop.
Definition at line 5527 of file LoopVectorize.cpp.
References _, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::begin(), blockNeedsPredicationForAnyReason(), llvm::LoopBase< BlockT, LoopT >::blocks(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::clear(), CM_Scalarize, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::contains(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), I, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarAfterVectorization(), isScalarWithPredication(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isZero(), llvm::predecessors(), and TheLoop.
Referenced by llvm::LoopVectorizationPlanner::plan(), and selectUserVectorizationFactor().
void LoopVectorizationCostModel::collectUniformsAndScalars | ( | ElementCount | VF | ) | inline |
Collect Uniform and Scalar values for the given VF.
The sets depend on CM decision for Load/Store instructions that may be vectorized as interleave, gather-scatter or scalarized. Also make a decision on what to do about call instructions in the loop at that VF – scalarize, call a known vector routine, or call a vector intrinsic.
Definition at line 1280 of file LoopVectorize.cpp.
References llvm::ElementCount::isScalar(), setCostBasedWideningDecision(), and setVectorizedCallDecision().
Referenced by calculateRegisterUsage(), llvm::LoopVectorizationPlanner::plan(), and selectUserVectorizationFactor().
void LoopVectorizationCostModel::collectValuesToIgnore | ( | ) |
Collect values we want to ignore in the cost model.
Definition at line 7001 of file LoopVectorize.cpp.
References _, AC, llvm::all_of(), llvm::any_of(), llvm::SmallVectorImpl< T >::append(), llvm::SmallVectorTemplateCommon< T, typename >::begin(), llvm::SmallPtrSetImpl< PtrType >::begin(), llvm::LoopBlocksDFS::beginRPO(), llvm::CodeMetrics::collectEphemeralValues(), llvm::LoopBase< BlockT, LoopT >::contains(), llvm::SmallVectorTemplateCommon< T, typename >::end(), llvm::SmallPtrSetImpl< PtrType >::end(), llvm::LoopBlocksDFS::endRPO(), llvm::RecurrenceDescriptor::getCastInsts(), llvm::InductionDescriptor::getCastInsts(), llvm::LoopBase< BlockT, LoopT >::getHeader(), llvm::LoopVectorizationLegality::getInductionVars(), getInterleavedAccessGroup(), llvm::getLoadStorePointerOperand(), llvm::LoopVectorizationLegality::getReductionVars(), llvm::BasicBlock::getSingleSuccessor(), I, isAccessInterleaved(), IsEmptyBlock(), llvm::LoopVectorizationLegality::isInvariantAddressOfReduction(), Legal, LI, llvm::make_range(), llvm::LoopBlocksDFS::perform(), llvm::BasicBlock::phis(), llvm::SmallVectorTemplateBase< T, bool >::push_back(), Reduction, requiresScalarEpilogue(), llvm::reverse(), llvm::SmallVectorBase< Size_T >::size(), TheLoop, TLI, ValuesToIgnore, VecValuesToIgnore, and llvm::wouldInstructionBeTriviallyDead().
Referenced by llvm::LoopVectorizationPlanner::plan().
FixedScalableVFPair LoopVectorizationCostModel::computeMaxVF | ( | ElementCount | UserVF, |
unsigned | UserIC | ||
) |
Definition at line 4117 of file LoopVectorize.cpp.
References llvm::ScalarEvolution::applyLoopGuards(), assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotAllowedLowTripLoop, llvm::CM_ScalarEpilogueNotAllowedOptSize, llvm::CM_ScalarEpilogueNotAllowedUsePredicate, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::DataWithEVL, llvm::dbgs(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::empty(), llvm::FixedScalableVFPair::FixedVF, foldTailByMasking(), llvm::ScalarEvolution::getAddExpr(), llvm::PredicatedScalarEvolution::getBackedgeTakenCount(), llvm::ScalarEvolution::getConstant(), llvm::LoopBase< BlockT, LoopT >::getExitingBlock(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::LoopBase< BlockT, LoopT >::getLoopLatch(), getMaxVScale(), llvm::FixedScalableVFPair::getNone(), llvm::ScalarEvolution::getOne(), llvm::LoopVectorizationLegality::getRuntimePointerChecking(), llvm::PredicatedScalarEvolution::getSE(), llvm::PredicatedScalarEvolution::getSmallConstantMaxTripCount(), llvm::ScalarEvolution::getSmallConstantTripCount(), llvm::PredicatedScalarEvolution::getSymbolicMaxBackedgeTakenCount(), getTailFoldingStyle(), llvm::SCEV::getType(), llvm::ScalarEvolution::getURemExpr(), llvm::TargetTransformInfo::hasBranchDivergence(), InterleaveInfo, llvm::InterleavedAccessInfo::invalidateGroupsRequiringScalarEpilogue(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isNonZero(), llvm::isPowerOf2_32(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::TargetTransformInfo::isVScaleKnownToBeAPowerOfTwo(), llvm::SCEV::isZero(), Legal, LLVM_DEBUG, llvm::RuntimePointerChecking::Need, ORE, PSE, llvm::reportVectorizationFailure(), runtimeChecksRequired(), llvm::FixedScalableVFPair::ScalableVF, setTailFoldingStyles(), TheFunction, TheLoop, and useMaskedInterleavedAccesses().
Referenced by llvm::LoopVectorizationPlanner::plan().
InstructionCost LoopVectorizationCostModel::expectedCost | ( | ElementCount | VF | ) |
Returns the expected execution cost.
The unit of the cost does not matter because we use the 'cost' units to compare different vector widths. The cost that is returned is not normalized by the factor width.
Definition at line 5693 of file LoopVectorize.cpp.
References addFullyUnrolledInstructionsToIgnore(), llvm::LoopVectorizationLegality::blockNeedsPredication(), llvm::LoopBase< BlockT, LoopT >::blocks(), llvm::CallingConv::C, llvm::SmallPtrSetImpl< PtrType >::count(), llvm::dbgs(), foldTailByMasking(), ForceTargetInstructionCost, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::LoopVectorizationLegality::getInductionVars(), getInstructionCost(), llvm::getReciprocalPredBlockProb(), llvm::PredicatedScalarEvolution::getSE(), llvm::ScalarEvolution::getSmallConstantTripCount(), I, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isFixed(), llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), Legal, LLVM_DEBUG, PSE, TheLoop, ValuesToIgnore, and VecValuesToIgnore.
Referenced by llvm::LoopVectorizationPlanner::computeBestVF(), selectInterleaveCount(), and selectUserVectorizationFactor().
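Because the cost returned by expectedCost() is not normalized by the vector width, a caller comparing two widths has to account for the width itself. The following is a minimal standalone sketch of one way to do that, not the comparison LLVM performs. CandidateCost, isCheaperPerLane and pickBest are hypothetical names, and plain integers stand in for ElementCount and InstructionCost.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one (VF, expectedCost(VF)) pair; the real types
// are ElementCount and InstructionCost.
struct CandidateCost {
  unsigned Width;    // number of lanes; 1 means scalar
  uint64_t LoopCost; // un-normalized cost of the loop body at this width
};

// Returns true if A is strictly cheaper than B per original loop iteration.
// Cross-multiplying compares LoopCost / Width without integer division.
inline bool isCheaperPerLane(const CandidateCost &A, const CandidateCost &B) {
  assert(A.Width > 0 && B.Width > 0 && "width must be non-zero");
  return A.LoopCost * B.Width < B.LoopCost * A.Width;
}

// Picks the candidate with the lowest per-lane cost; ties keep the earlier one.
inline CandidateCost pickBest(const std::vector<CandidateCost> &Candidates) {
  assert(!Candidates.empty() && "need at least one candidate");
  CandidateCost Best = Candidates.front();
  for (const CandidateCost &C : Candidates)
    if (isCheaperPerLane(C, Best))
      Best = C;
  return Best;
}
```

With candidates {1, 10}, {4, 24} and {8, 80}, the per-lane costs are 10, 6 and 10, so the width-4 candidate is selected even though its raw cost is higher than the scalar one.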
bool LoopVectorizationCostModel::foldTailByMasking | ( | ) | const | inline |
Returns true if all loop blocks should be masked to fold tail loop.
Definition at line 1484 of file LoopVectorize.cpp.
References getTailFoldingStyle(), and llvm::None.
Referenced by blockNeedsPredicationForAnyReason(), computeMaxVF(), llvm::VPRecipeBuilder::createHeaderMask(), expectedCost(), llvm::LoopVectorizationPlanner::plan(), and setCostBasedWideningDecision().
bool LoopVectorizationCostModel::foldTailWithEVL | ( | ) | const | inline |
Returns true if VP intrinsics with explicit vector length support should be generated in the tail folded loop.
Definition at line 1508 of file LoopVectorize.cpp.
References llvm::DataWithEVL, and getTailFoldingStyle().
Referenced by getInstructionCost(), selectInterleaveCount(), and usePredicatedReductionSelect().
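The two tail-folding predicates above both derive from getTailFoldingStyle(). The sketch below illustrates that relationship as suggested by their descriptions and References lists. It is a simplified assumption, not the LLVM source: FoldStyle is a hypothetical three-value subset of TailFoldingStyle, and TailFoldingQuery is a hypothetical holder for the selected style.

```cpp
#include <cassert>

// Hypothetical, reduced set of tail-folding styles; LLVM's TailFoldingStyle
// has more enumerators than shown here.
enum class FoldStyle { None, Data, DataWithEVL };

struct TailFoldingQuery {
  FoldStyle Style = FoldStyle::None;

  // Any style other than None means the loop blocks are masked to fold the
  // tail into the vector loop.
  bool foldTailByMasking() const { return Style != FoldStyle::None; }

  // Only the explicit-vector-length style asks for VP intrinsics with EVL.
  bool foldTailWithEVL() const { return Style == FoldStyle::DataWithEVL; }
};
```

Under this reading, foldTailWithEVL() implies foldTailByMasking(), but not the other way round.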
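CallWideningDecision LoopVectorizationCostModel::getCallWideningDecision | ( | CallInst * | CI, |
ElementCount | VF | ||
) | const | inline |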
Definition at line 1238 of file LoopVectorize.cpp.
References assert(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::at(), and llvm::ElementCount::isScalar().
std::pair< InstructionCost, InstructionCost > LoopVectorizationCostModel::getDivRemSpeculationCost | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane.
First result is for scalarization (will be invalid for scalable vectors); second is for the safe-divisor strategy.
Definition at line 3498 of file LoopVectorize.cpp.
References assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, CostKind, llvm::TargetTransformInfo::getArithmeticInstrCost(), llvm::TargetTransformInfo::getCFInstrCost(), llvm::TargetTransformInfo::getCmpSelInstrCost(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::TargetTransformInfo::getOperandInfo(), llvm::getReciprocalPredBlockProb(), I, llvm::LoopVectorizationLegality::isInvariant(), llvm::isSafeToSpeculativelyExecute(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::TargetTransformInfo::OperandValueInfo::Kind, Legal, llvm::TargetTransformInfo::OK_AnyValue, llvm::TargetTransformInfo::OK_UniformValue, Operands, llvm::TargetTransformInfo::TCK_RecipThroughput, and llvm::toVectorTy().
Referenced by getInstructionCost(), and isScalarWithPredication().
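The pair returned here is meant to be fed to a chooser such as isDivRemScalarWithPredication(). The sketch below shows one plausible way to pick between the two strategies when the first cost may be invalid. It is an illustration under stated assumptions, not LLVM's implementation: std::optional models an invalid InstructionCost, and preferScalarWithPredication is a hypothetical helper.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical cost type: std::nullopt plays the role of an invalid
// InstructionCost (e.g. scalarization with a scalable VF).
using Cost = std::optional<unsigned>;

// first: cost of scalarization; second: cost of the safe-divisor strategy.
using DivRemCosts = std::pair<Cost, Cost>;

// Returns true if the scalarized, predicated lowering should be chosen.
// An invalid cost is treated as more expensive than any valid cost, and a
// tie falls back to the safe-divisor strategy.
inline bool preferScalarWithPredication(const DivRemCosts &Costs) {
  if (!Costs.first)
    return false; // scalarization is not available
  if (!Costs.second)
    return true; // only scalarization is available
  return *Costs.first < *Costs.second;
}
```

For a scalable VF the first element is invalid, so this chooser always reports that scalarization should not be used.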
InstructionCost LoopVectorizationCostModel::getInstructionCost | ( | Instruction * | I, |
ElementCount | VF | ||
) |
Returns the execution time cost of an instruction for a given vector width.
Vector width of one means scalar.
Definition at line 6561 of file LoopVectorize.cpp.
References llvm::all_of(), assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, canTruncateToMinimalBitwidth(), CM_GatherScatter, CM_Interleave, CM_IntrinsicCall, CM_Scalarize, CM_Unknown, CM_VectorCall, CM_Widen, CM_Widen_Reverse, llvm::MapVector< KeyT, ValueT, MapType, VectorType >::contains(), llvm::LoopBase< BlockT, LoopT >::contains(), CostKind, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::count(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::MapVector< KeyT, ValueT, MapType, VectorType >::find(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), foldTailWithEVL(), llvm::TargetTransformInfo::GatherScatter, llvm::IntegerType::get(), llvm::VectorType::get(), llvm::APInt::getAllOnes(), llvm::TargetTransformInfo::getArithmeticInstrCost(), llvm::TargetTransformInfo::getCastInstrCost(), llvm::TargetTransformInfo::getCFInstrCost(), llvm::TargetTransformInfo::getCmpSelInstrCost(), llvm::Type::getContext(), getDivRemSpeculationCost(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::LoopBase< BlockT, LoopT >::getHeader(), llvm::LoopVectorizationLegality::getHistogramInfo(), llvm::TargetTransformInfo::getInstructionCost(), getInstructionCost(), llvm::Type::getInt1Ty(), llvm::TargetTransformInfo::getIntrinsicInstrCost(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStoreType(), llvm::LoopBase< BlockT, LoopT >::getLoopLatch(), llvm::TargetTransformInfo::getNumberOfParts(), llvm::TargetTransformInfo::getOperandInfo(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::LoadInst::getPointerOperandType(), getReductionPatternCost(), llvm::LoopVectorizationLegality::getReductionVars(), llvm::TargetTransformInfo::getScalarizationOverhead(), llvm::Type::getScalarSizeInBits(), llvm::ScalarEvolution::getSCEV(), 
llvm::PredicatedScalarEvolution::getSCEV(), llvm::PredicatedScalarEvolution::getSE(), llvm::TargetTransformInfo::getShuffleCost(), llvm::BranchInst::getSuccessor(), llvm::Value::getType(), getVectorCallCost(), llvm::Type::getVoidTy(), getWideningCost(), getWideningDecision(), I, llvm::CmpInst::ICMP_EQ, Info, llvm::TargetTransformInfo::Interleave, llvm::RecurrenceDescriptor::isAnyOfRecurrenceKind(), llvm::BranchInst::isConditional(), isDivRemScalarWithPredication(), llvm::LoopVectorizationLegality::isFixedOrderRecurrence(), isInLoopReduction(), llvm::ScalarEvolution::isLoopInvariant(), llvm::LoopVectorizationLegality::isMaskRequired(), llvm::SCEV::isOne(), isOptimizableIVTruncate(), isPredicatedInst(), isProfitableToScalarize(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarAfterVectorization(), llvm::ScalarEvolution::isSCEVable(), isUniformAfterVectorization(), llvm::ElementCount::isVector(), llvm::Type::isVectorTy(), llvm::TargetTransformInfo::OperandValueInfo::Kind, Legal, llvm_unreachable, llvm::llvm_unreachable_internal(), llvm::HistogramInfo::Load, llvm::PatternMatch::m_LogicalAnd(), llvm::PatternMatch::m_LogicalOr(), llvm::PatternMatch::m_Value(), llvm::TargetTransformInfo::Masked, llvm::PatternMatch::match(), llvm::TargetTransformInfo::None, llvm::TargetTransformInfo::Normal, llvm::TargetTransformInfo::OK_AnyValue, llvm::TargetTransformInfo::OK_UniformValue, Operands, PSE, RetTy, llvm::TargetTransformInfo::Reversed, RHS, shouldConsiderInvariant(), llvm::TargetTransformInfo::SK_Splice, llvm::TargetTransformInfo::TCC_Free, llvm::TargetTransformInfo::TCK_RecipThroughput, TheLoop, TLI, and llvm::toVectorTy().
Referenced by expectedCost(), getInstructionCost(), and llvm::VPCostContext::getLegacyCost().
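const InterleaveGroup< Instruction > * LoopVectorizationCostModel::getInterleavedAccessGroup | ( | Instruction * | Instr | ) | const | inline |
Get the interleaved access group that Instr belongs to.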
Definition at line 1379 of file LoopVectorize.cpp.
References llvm::InterleavedAccessInfo::getInterleaveGroup(), and InterleaveInfo.
Referenced by collectValuesToIgnore(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().
std::optional< unsigned > LoopVectorizationCostModel::getMaxSafeElements | ( | ) | const | inline |
Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies.
Required for EVL-based VPlans to correctly calculate AVL (application vector length) as min(remaining AVL, MaxSafeElements). TODO: need to consider adjusting cost model to use this value as a vectorization factor for EVL-based vectorization.
Definition at line 1497 of file LoopVectorize.cpp.
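The min(remaining AVL, MaxSafeElements) rule can be illustrated with a small standalone loop. This is a sketch only: stepsForAVL is a hypothetical helper, and MaxLanes is an assumed hardware limit on the elements processed per iteration, which the text above does not mention.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the number of elements processed by each vector iteration when the
// per-iteration request is min(remaining AVL, MaxSafeElements).
// MaxSafeElements == std::nullopt models "no dependence-imposed limit".
// MaxLanes is a hypothetical hardware cap on elements per iteration.
inline std::vector<uint64_t>
stepsForAVL(uint64_t AVL, uint64_t MaxLanes,
            std::optional<unsigned> MaxSafeElements) {
  assert(MaxLanes > 0 && "need at least one lane");
  uint64_t Limit = MaxLanes;
  if (MaxSafeElements) {
    assert(*MaxSafeElements > 0 && "safe element count must be positive");
    Limit = std::min<uint64_t>(Limit, *MaxSafeElements);
  }
  std::vector<uint64_t> Steps;
  while (AVL > 0) {
    uint64_t Step = std::min(AVL, Limit); // never exceed what remains
    Steps.push_back(Step);
    AVL -= Step;
  }
  return Steps;
}
```

With 10 remaining elements, 8 lanes and a safe limit of 3, the iterations process 3, 3, 3 and 1 elements; without a safe limit and with 4 lanes they process 4, 4 and 2.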
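const MapVector< Instruction *, uint64_t > & LoopVectorizationCostModel::getMinimalBitwidths | ( | ) | const | inline |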
Definition at line 1086 of file LoopVectorize.cpp.
std::optional< InstructionCost > LoopVectorizationCostModel::getReductionPatternCost | ( | Instruction * | I, |
ElementCount | VF, | ||
Type * | VectorTy, | ||
TTI::TargetCostKind | CostKind | ||
) | const |
Return the cost of instructions in an inloop reduction pattern, if I is part of that pattern.
Definition at line 5947 of file LoopVectorize.cpp.
References llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::at(), CostKind, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::count(), llvm::SmallPtrSetImplBase::empty(), llvm::MapVector< KeyT, ValueT, MapType, VectorType >::find(), llvm::FMulAdd, llvm::VectorType::get(), llvm::TargetTransformInfo::getArithmeticInstrCost(), llvm::TargetTransformInfo::getArithmeticReductionCost(), llvm::TargetTransformInfo::getCastInstrCost(), llvm::TargetTransformInfo::getExtendedReductionCost(), llvm::RecurrenceDescriptor::getFastMathFlags(), llvm::Type::getIntegerBitWidth(), llvm::TargetTransformInfo::getMinMaxReductionCost(), llvm::getMinMaxReductionIntrinsicOp(), llvm::TargetTransformInfo::getMulAccReductionCost(), llvm::Instruction::getOpcode(), llvm::RecurrenceDescriptor::getOpcode(), llvm::User::getOperand(), llvm::RecurrenceDescriptor::getRecurrenceKind(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::LoopVectorizationLegality::getReductionVars(), llvm::Value::getType(), llvm::Value::hasOneUser(), I, llvm::Loop::isLoopInvariant(), llvm::RecurrenceDescriptor::isMinMaxRecurrenceKind(), llvm::ElementCount::isScalar(), llvm::InstructionCost::isValid(), Legal, llvm::PatternMatch::m_Instruction(), llvm::PatternMatch::m_Mul(), llvm::PatternMatch::m_OneUse(), llvm::PatternMatch::m_Value(), llvm::PatternMatch::m_ZExtOrSExt(), llvm::PatternMatch::match(), llvm::TargetTransformInfo::None, TheLoop, useOrderedReductions(), and llvm::Instruction::user_back().
Referenced by getInstructionCost(), getVectorCallCost(), and setVectorizedCallDecision().
std::pair< unsigned, unsigned > LoopVectorizationCostModel::getSmallestAndWidestTypes | ( | ) |
Definition at line 4946 of file LoopVectorize.cpp.
References DL, ElementTypesInLoop, llvm::MapVector< KeyT, ValueT, MapType, VectorType >::empty(), llvm::Function::getDataLayout(), llvm::RecurrenceDescriptor::getMinWidthCastToRecurrenceTypeInBits(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::LoopVectorizationLegality::getReductionVars(), llvm::Type::getScalarSizeInBits(), Legal, and TheFunction.
Referenced by determineVPlanVF().
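TailFoldingStyle LoopVectorizationCostModel::getTailFoldingStyle | ( | bool | IVUpdateMayOverflow = true | ) | const | inline |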
Returns the TailFoldingStyle that is best for the current loop.
Definition at line 1429 of file LoopVectorize.cpp.
References llvm::None.
Referenced by computeMaxVF(), foldTailByMasking(), and foldTailWithEVL().
InstructionCost LoopVectorizationCostModel::getVectorCallCost | ( | CallInst * | CI, |
ElementCount | VF | ||
) | const |
Estimate cost of a call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed.
Definition at line 2985 of file LoopVectorize.cpp.
References llvm::CallBase::args(), CostKind, llvm::CallBase::getCalledFunction(), llvm::TargetTransformInfo::getCallInstrCost(), getReductionPatternCost(), llvm::Value::getType(), getVectorIntrinsicCost(), llvm::getVectorIntrinsicIDForCall(), llvm::RecurrenceDescriptor::isFMulAddIntrinsic(), llvm::ElementCount::isScalar(), RetTy, llvm::TargetTransformInfo::TCK_RecipThroughput, and TLI.
Referenced by getInstructionCost().
InstructionCost LoopVectorizationCostModel::getVectorIntrinsicCost | ( | CallInst * | CI, |
ElementCount | VF | ||
) | const |
Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF.
Return the cost of the instruction, including scalarization overhead if it's needed.
Definition at line 3020 of file LoopVectorize.cpp.
References llvm::CallBase::args(), Arguments, assert(), llvm::CallBase::getCalledFunction(), llvm::Function::getFunctionType(), llvm::TargetTransformInfo::getIntrinsicInstrCost(), llvm::Value::getType(), llvm::getVectorIntrinsicIDForCall(), maybeVectorizeType(), llvm::FunctionType::param_begin(), llvm::FunctionType::param_end(), RetTy, llvm::TargetTransformInfo::TCK_RecipThroughput, and TLI.
Referenced by getVectorCallCost(), and setVectorizedCallDecision().
|
inline |
Return the vectorization cost for the given instruction I and vector width VF.
Definition at line 1213 of file LoopVectorize.cpp.
References assert(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::contains(), I, and llvm::ElementCount::isVector().
Referenced by getInstructionCost().
|
inline |
Return the cost model decision for the given instruction I and vector width VF.
Return CM_Unknown if this instruction did not pass through the cost modeling.
Definition at line 1198 of file LoopVectorize.cpp.
References assert(), CM_Unknown, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), I, llvm::LoopBase< BlockT, LoopT >::isInnermost(), llvm::ElementCount::isVector(), and TheLoop.
Referenced by getInstructionCost(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().
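The lookup behaviour described above (return the stored decision, or CM_Unknown when the instruction never went through cost modeling) can be sketched with a plain map. This is a simplified illustration, not the LLVM implementation: `DecisionTable`, `InstId` and `VFKey` are hypothetical stand-ins for the real map keyed on an instruction and an ElementCount, and costs are modeled as plain integers.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Subset of the InstWidening enumerators listed on this page.
enum InstWidening { CM_Unknown, CM_Widen, CM_Widen_Reverse, CM_Interleave,
                    CM_GatherScatter, CM_Scalarize };

using InstId = int;     // hypothetical stand-in for Instruction *
using VFKey = unsigned; // hypothetical stand-in for a fixed ElementCount

struct DecisionTable {
  // (instruction, VF) -> (decision, cost)
  std::map<std::pair<InstId, VFKey>, std::pair<InstWidening, unsigned>> Decisions;

  void setWideningDecision(InstId I, VFKey VF, InstWidening W, unsigned Cost) {
    assert(VF > 1 && "expected a vector VF");
    Decisions[{I, VF}] = {W, Cost};
  }

  // Returns CM_Unknown when no decision was recorded for (I, VF).
  InstWidening getWideningDecision(InstId I, VFKey VF) const {
    assert(VF > 1 && "expected a vector VF");
    auto It = Decisions.find({I, VF});
    return It == Decisions.end() ? CM_Unknown : It->second.first;
  }
};
```

Note that a decision recorded for one VF says nothing about another VF: the key is the pair, so the same instruction queried at a different width yields CM_Unknown in this sketch.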
|
inline |
Definition at line 1553 of file LoopVectorize.cpp.
bool LoopVectorizationCostModel::interleavedAccessCanBeWidened | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
Returns true if I is a memory instruction in an interleaved group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.
Definition at line 3563 of file LoopVectorize.cpp.
References assert(), blockNeedsPredicationForAnyReason(), CM_Unknown, DL, getInterleavedAccessGroup(), llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), getWideningDecision(), hasIrregularType(), I, Idx, isAccessInterleaved(), llvm::TargetTransformInfo::isLegalMaskedLoad(), llvm::TargetTransformInfo::isLegalMaskedStore(), llvm::LoopVectorizationLegality::isMaskRequired(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), isScalarEpilogueAllowed(), Legal, and useMaskedInterleavedAccesses().
Referenced by setCostBasedWideningDecision().
|
inline |
Invalidates decisions already taken by the cost model.
Definition at line 1540 of file LoopVectorize.cpp.
References llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::clear().
Referenced by llvm::LoopVectorizationPlanner::plan().
|
inline |
Check if Instr belongs to any interleaved access group.
Definition at line 1373 of file LoopVectorize.cpp.
References InterleaveInfo, and llvm::InterleavedAccessInfo::isInterleaved().
Referenced by collectValuesToIgnore(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().
|
inline |
Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem.
This incorporates an override option so it is not simply a cost comparison.
Definition at line 1331 of file LoopVectorize.cpp.
References llvm::cl::BOU_FALSE, llvm::cl::BOU_TRUE, llvm::cl::BOU_UNSET, ForceSafeDivisor, and llvm_unreachable.
Referenced by getInstructionCost(), and isScalarWithPredication().
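As a rough illustration of "a cost comparison with an override", the sketch below models the override as a three-state value, in the spirit of the BOU_UNSET / BOU_TRUE / BOU_FALSE references listed above. The enum `Override`, the function name and the integer costs are hypothetical, and the mapping of each state to a result is an assumption of this sketch rather than a statement about the actual source.

```cpp
#include <cassert>

// Hypothetical tri-state override: unset means "decide by cost".
enum class Override { Unset, ForceSafeDivisor, ForceScalar };

// Returns true if scalar predication should be used for a div/rem.
bool useScalarPredicationForDivRem(unsigned ScalarCost,
                                   unsigned SafeDivisorCost, Override O) {
  switch (O) {
  case Override::Unset:
    // No override: pick scalarization only when it is strictly cheaper.
    return ScalarCost < SafeDivisorCost;
  case Override::ForceSafeDivisor:
    return false; // Always use the safe-divisor strategy.
  case Override::ForceScalar:
    return true; // Always scalarize with predication.
  }
  assert(false && "unhandled override value");
  return false;
}
```

With the override unset, equal costs resolve to the safe-divisor strategy in this sketch, because the comparison is strict.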
bool LoopVectorizationCostModel::isEpilogueVectorizationProfitable | ( | const ElementCount | VF, |
const unsigned | IC | ||
) | const |
Returns true if epilogue vectorization is considered profitable, and false otherwise.
VF is the vectorization factor chosen for the original loop. Multiplier is an additional scaling factor applied to VF before comparing to EpilogueVectorizationMinVF.
Definition at line 4809 of file LoopVectorize.cpp.
References EpilogueVectorizationMinVF, llvm::TargetTransformInfo::getEpilogueVectorizationMinVF(), getEstimatedRuntimeVF(), llvm::TargetTransformInfo::getMaxInterleaveFactor(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isFixed(), llvm::TargetTransformInfo::preferEpilogueVectorization(), and TheLoop.
Referenced by llvm::LoopVectorizationPlanner::selectEpilogueVectorizationFactor().
Returns true if the Phi is part of an inloop reduction.
Definition at line 1513 of file LoopVectorize.cpp.
Referenced by getInstructionCost(), and llvm::VPRecipeBuilder::tryToCreateWidenRecipe().
|
inline |
Returns true if the target machine can represent V as a masked gather or scatter operation.
Definition at line 1306 of file LoopVectorize.cpp.
References llvm::VectorType::get(), llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), llvm::TargetTransformInfo::isLegalMaskedGather(), llvm::TargetTransformInfo::isLegalMaskedScatter(), llvm::ElementCount::isVector(), and LI.
Referenced by setCostBasedWideningDecision().
|
inline |
Returns true if the target machine supports a masked load operation for the given DataType and kind of access to Ptr.
Definition at line 1299 of file LoopVectorize.cpp.
References llvm::LoopVectorizationLegality::isConsecutivePtr(), llvm::TargetTransformInfo::isLegalMaskedLoad(), Legal, and Ptr.
Referenced by isScalarWithPredication().
|
inline |
Returns true if the target machine supports a masked store operation for the given DataType and kind of access to Ptr.
Definition at line 1292 of file LoopVectorize.cpp.
References llvm::LoopVectorizationLegality::isConsecutivePtr(), llvm::TargetTransformInfo::isLegalMaskedStore(), Legal, and Ptr.
Referenced by isScalarWithPredication().
|
inline |
Return true if instruction I is an optimizable truncate whose operand is an induction variable.
Such a truncate will be removed by adding a new induction variable with the destination type.
Definition at line 1247 of file LoopVectorize.cpp.
References llvm::LoopVectorizationLegality::getPrimaryInduction(), I, llvm::LoopVectorizationLegality::isInductionPhi(), llvm::TargetTransformInfo::isTruncateFree(), Legal, and llvm::toVectorTy().
Referenced by getInstructionCost().
bool LoopVectorizationCostModel::isPredicatedInst | ( | Instruction * | I | ) | const |
Returns true if I is an instruction that needs to be predicated at runtime.
The result is independent of the predication mechanism. The instructions for which this returns true are a superset of those for which isScalarWithPredication returns true.
Definition at line 3447 of file LoopVectorize.cpp.
References assert(), llvm::LoopVectorizationLegality::blockNeedsPredication(), blockNeedsPredicationForAnyReason(), llvm::getLoadStorePointerOperand(), I, llvm::LoopVectorizationLegality::isInvariant(), llvm::Loop::isLoopInvariant(), llvm::LoopVectorizationLegality::isMaskRequired(), llvm::isSafeToSpeculativelyExecute(), Legal, llvm_unreachable, and TheLoop.
Referenced by getInstructionCost(), llvm::VPRecipeBuilder::handleReplication(), isScalarWithPredication(), and shouldConsiderInvariant().
|
inline |
I for vectorization factor VF.
Definition at line 1092 of file LoopVectorize.cpp.
References assert(), I, llvm::LoopBase< BlockT, LoopT >::isInnermost(), llvm::ElementCount::isVector(), and TheLoop.
Referenced by canTruncateToMinimalBitwidth(), and getInstructionCost().
|
inline |
Returns true if I is known to be scalar after vectorization.
Definition at line 1126 of file LoopVectorize.cpp.
References assert(), I, llvm::LoopBase< BlockT, LoopT >::isInnermost(), llvm::ElementCount::isScalar(), and TheLoop.
Referenced by calculateRegisterUsage(), canTruncateToMinimalBitwidth(), collectInstsToScalarize(), and getInstructionCost().
|
inline |
Returns true if a scalar epilogue is allowed, i.e. it is not prohibited by optsize or a loop hint annotation.
Definition at line 1424 of file LoopVectorize.cpp.
References llvm::CM_ScalarEpilogueAllowed.
Referenced by interleavedAccessCanBeWidened(), requiresScalarEpilogue(), llvm::LoopVectorizationPlanner::selectEpilogueVectorizationFactor(), and selectInterleaveCount().
bool LoopVectorizationCostModel::isScalarWithPredication | ( | Instruction * | I, |
ElementCount | VF | ||
) | const |
Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available).
VF is the vectorization factor that will be used to vectorize I.
Definition at line 3405 of file LoopVectorize.cpp.
References CM_Scalarize, llvm::VectorType::get(), getDivRemSpeculationCost(), llvm::getLoadStoreAlignment(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), I, isDivRemScalarWithPredication(), llvm::TargetTransformInfo::isLegalMaskedGather(), isLegalMaskedLoad(), llvm::TargetTransformInfo::isLegalMaskedScatter(), isLegalMaskedStore(), isPredicatedInst(), llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), and Ptr.
Referenced by collectInstsToScalarize(), memoryInstructionCanBeWidened(), and setCostBasedWideningDecision().
|
inline |
Returns true if I is known to be uniform after vectorization.
Definition at line 1106 of file LoopVectorize.cpp.
References assert(), I, llvm::LoopBase< BlockT, LoopT >::isInnermost(), llvm::ElementCount::isScalar(), and TheLoop.
Referenced by getInstructionCost(), llvm::VPRecipeBuilder::handleReplication(), and setVectorizedCallDecision().
bool LoopVectorizationCostModel::memoryInstructionCanBeWidened | ( | Instruction * | I, |
ElementCount | VF | ||
) |
Returns true if I is a memory instruction with consecutive memory access that can be widened.
Definition at line 3637 of file LoopVectorize.cpp.
References assert(), DL, llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), hasIrregularType(), I, llvm::LoopVectorizationLegality::isConsecutivePtr(), isScalarWithPredication(), Legal, and Ptr.
Referenced by setCostBasedWideningDecision().
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.
Definition at line 1385 of file LoopVectorize.cpp.
References llvm::dbgs(), EnableEarlyExitVectorization, llvm::LoopBase< BlockT, LoopT >::getExitingBlock(), llvm::LoopBase< BlockT, LoopT >::getLoopLatch(), llvm::LoopVectorizationLegality::hasUncountableEarlyExit(), InterleaveInfo, isScalarEpilogueAllowed(), Legal, LLVM_DEBUG, llvm::InterleavedAccessInfo::requiresScalarEpilogue(), and TheLoop.
Referenced by collectValuesToIgnore(), requiresScalarEpilogue(), and selectInterleaveCount().
Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop for all VFs in Range.
A scalar epilogue must either be required for all VFs in Range or for none.
Definition at line 1411 of file LoopVectorize.cpp.
References llvm::all_of(), assert(), llvm::none_of(), Range, and requiresScalarEpilogue().
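The "all or none" contract for a range of VFs can be sketched with the standard algorithms, mirroring the all_of / none_of / assert references listed above. This is a simplified model: the function name `requiresScalarEpilogueForRange`, the use of `std::vector<unsigned>` for the range, and the `RequiresForVF` callback are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Returns true if every VF in the range requires a scalar epilogue.
// The range is expected to be uniform: either all VFs require one or none do.
bool requiresScalarEpilogueForRange(
    const std::vector<unsigned> &VFs,
    const std::function<bool(unsigned)> &RequiresForVF) {
  bool IsRequired = std::all_of(VFs.begin(), VFs.end(), RequiresForVF);
  // A mixed range would make the single boolean answer meaningless.
  assert((IsRequired ||
          std::none_of(VFs.begin(), VFs.end(), RequiresForVF)) &&
         "all VFs in the range must agree on the scalar epilogue");
  return IsRequired;
}
```

The assertion documents the precondition rather than handling it: a caller that passes a mixed range has split its VF range incorrectly.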
bool LoopVectorizationCostModel::runtimeChecksRequired | ( | ) |
Definition at line 3896 of file LoopVectorize.cpp.
References llvm::dbgs(), llvm::LoopVectorizationLegality::getLAI(), llvm::PredicatedScalarEvolution::getPredicate(), llvm::LoopVectorizationLegality::getRuntimePointerChecking(), llvm::LoopAccessInfo::getSymbolicStrides(), llvm::SCEVPredicate::isAlwaysTrue(), Legal, LLVM_DEBUG, llvm::RuntimePointerChecking::Need, ORE, PSE, llvm::reportVectorizationFailure(), and TheLoop.
Referenced by computeMaxVF().
unsigned LoopVectorizationCostModel::selectInterleaveCount | ( | ElementCount | VF, |
InstructionCost | LoopCost | ||
) |
Definition at line 5021 of file LoopVectorize.cpp.
References llvm::any_of(), assert(), llvm::bit_floor(), llvm::LoopBase< BlockT, LoopT >::blocks(), calculateRegisterUsage(), llvm::dbgs(), llvm::MapVector< KeyT, ValueT, MapType, VectorType >::empty(), llvm::TargetTransformInfo::enableAggressiveInterleaving(), EnableIndVarRegisterHeur, EnableLoadStoreRuntimeInterleave, expectedCost(), F, foldTailWithEVL(), ForceTargetMaxScalarInterleaveFactor, ForceTargetMaxVectorInterleaveFactor, ForceTargetNumScalarRegs, ForceTargetNumVectorRegs, getEstimatedRuntimeVF(), llvm::LoopBase< BlockT, LoopT >::getLoopDepth(), llvm::TargetTransformInfo::getMaxInterleaveFactor(), llvm::TargetTransformInfo::getNumberOfRegisters(), llvm::LoopVectorizationLegality::getNumLoads(), llvm::LoopVectorizationLegality::getNumStores(), llvm::LoopVectorizationLegality::getReductionVars(), llvm::TargetTransformInfo::getRegisterClassName(), llvm::LoopVectorizationLegality::getRuntimePointerChecking(), llvm::PredicatedScalarEvolution::getSE(), getSmallBestKnownTC(), llvm::ScalarEvolution::getSmallConstantTripCount(), llvm::InstructionCost::getValue(), llvm::LoopVectorizationLegality::hasUncountableEarlyExit(), llvm::LoopVectorizationLegality::isSafeForAnyVectorWidth(), llvm::ElementCount::isScalar(), isScalarEpilogueAllowed(), llvm::InstructionCost::isValid(), llvm::ElementCount::isVector(), Legal, LLVM_DEBUG, MaxNestedScalarReductionIC, llvm::RuntimePointerChecking::Need, PSE, Reduction, requiresScalarEpilogue(), SmallLoopCost, and TheLoop.
Referenced by llvm::LoopVectorizePass::processLoop().
|
inline |
Setup cost-based decisions for user vectorization factor.
Definition at line 1016 of file LoopVectorize.cpp.
References collectInstsToScalarize(), collectUniformsAndScalars(), expectedCost(), and llvm::InstructionCost::isValid().
Referenced by llvm::LoopVectorizationPlanner::plan().
|
inline |
Definition at line 1229 of file LoopVectorize.cpp.
References assert(), and llvm::ElementCount::isScalar().
Referenced by setVectorizedCallDecision().
void LoopVectorizationCostModel::setCostBasedWideningDecision | ( | ElementCount | VF | ) |
A memory access instruction may be vectorized in more than one way.
The form of the instruction after vectorization depends on cost. This function makes cost-based decisions for load/store instructions and records them in a map. That map is then used to build the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved with the widening decision to avoid redundant calculations.
Definition at line 6187 of file LoopVectorize.cpp.
References llvm::append_range(), assert(), llvm::LoopBase< BlockT, LoopT >::blocks(), CM_GatherScatter, CM_Interleave, CM_Scalarize, CM_Unknown, CM_Widen, CM_Widen_Reverse, llvm::LoopBase< BlockT, LoopT >::contains(), llvm::SmallVectorBase< Size_T >::empty(), foldTailByMasking(), llvm::ElementCount::getFixed(), getInterleavedAccessGroup(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), getWideningDecision(), I, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::insert(), llvm::SmallPtrSetImpl< PtrType >::insert(), interleavedAccessCanBeWidened(), isAccessInterleaved(), llvm::LoopVectorizationLegality::isConsecutivePtr(), isLegalGatherOrScatter(), llvm::Loop::isLoopInvariant(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarWithPredication(), llvm::LoopVectorizationLegality::isUniformMemOp(), Legal, memoryInstructionCanBeWidened(), llvm::SmallVectorImpl< T >::pop_back_val(), llvm::TargetTransformInfo::prefersVectorizedAddressing(), Ptr, llvm::SmallVectorTemplateBase< T, bool >::push_back(), setWideningDecision(), and TheLoop.
Referenced by collectUniformsAndScalars().
|
inline |
Selects and saves the TailFoldingStyle for two cases: when the IV update may overflow and when it may not.
Parameters
IsScalableVF | true if scalable vector factors are enabled. |
UserIC | User-specified interleave count. |
Definition at line 1440 of file LoopVectorize.cpp.
References assert(), llvm::LoopVectorizationLegality::canFoldTailByMasking(), llvm::DataWithEVL, llvm::DataWithoutLaneMask, llvm::dbgs(), llvm::EnableVPlanNativePath, ForceTailFoldingStyle, llvm::TargetTransformInfo::getPreferredTailFoldingStyle(), llvm::TargetTransformInfo::hasActiveVectorLength(), Legal, LLVM_DEBUG, and llvm::None.
Referenced by computeMaxVF().
void LoopVectorizationCostModel::setVectorizedCallDecision | ( | ElementCount | VF | ) |
A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking.
This function analyzes all calls in the function at the supplied VF, makes a decision based on the costs of available options, and stores that decision in a map for use in planning and plan execution.
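The "decision based on the costs of available options" can be pictured as a minimum over the strategies that are actually available. The sketch below is illustrative only: `CallCosts`, `pickCallDecision` and the `Invalid` sentinel are hypothetical, costs are plain integers, and the tie-breaking order (later options win on equal cost) is an assumption of this sketch.

```cpp
#include <cassert>
#include <limits>

// Subset of the InstWidening enumerators that apply to calls.
enum CallDecision { CM_Scalarize, CM_VectorCall, CM_IntrinsicCall };

// Hypothetical sentinel for "this option is not available".
constexpr unsigned Invalid = std::numeric_limits<unsigned>::max();

struct CallCosts {
  unsigned Scalarized;    // cost of scalarizing the call (always available)
  unsigned VectorVariant; // cost of a vector library variant, or Invalid
  unsigned Intrinsic;     // cost of a vector intrinsic, or Invalid
};

CallDecision pickCallDecision(const CallCosts &C) {
  assert(C.Scalarized != Invalid && "scalarization must have a valid cost");
  CallDecision Best = CM_Scalarize;
  unsigned BestCost = C.Scalarized;
  if (C.VectorVariant != Invalid && C.VectorVariant <= BestCost) {
    Best = CM_VectorCall;
    BestCost = C.VectorVariant;
  }
  if (C.Intrinsic != Invalid && C.Intrinsic <= BestCost) {
    Best = CM_IntrinsicCall;
    BestCost = C.Intrinsic;
  }
  return Best;
}
```

Whether a vector variant is available at all depends, per the description above, on the vectorized variants on offer and on whether the target supports masking; in this sketch that is folded into the `Invalid` sentinel.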
Definition at line 6374 of file LoopVectorize.cpp.
References llvm::CallBase::args(), assert(), llvm::LoopBase< BlockT, LoopT >::blocks(), CM_IntrinsicCall, CM_Scalarize, CM_VectorCall, CostKind, llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::end(), llvm::DenseMapBase< DerivedT, KeyT, ValueT, KeyInfoT, BucketT >::find(), llvm::VectorType::get(), llvm::SCEVConstant::getAPInt(), llvm::CallBase::getArgOperand(), llvm::CallBase::getCalledFunction(), llvm::TargetTransformInfo::getCallInstrCost(), llvm::Type::getContext(), llvm::Module::getFunction(), llvm::Function::getFunctionType(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::VFDatabase::getMappings(), llvm::Instruction::getModule(), llvm::VFInfo::getParamIndexForOptionalMask(), getReductionPatternCost(), llvm::ScalarEvolution::getSCEV(), llvm::PredicatedScalarEvolution::getSCEV(), llvm::PredicatedScalarEvolution::getSE(), llvm::APInt::getSExtValue(), llvm::TargetTransformInfo::getShuffleCost(), llvm::Value::getType(), getVectorIntrinsicCost(), llvm::getVectorIntrinsicIDForCall(), llvm::GlobalPredicate, I, Info, llvm::RecurrenceDescriptor::isFMulAddIntrinsic(), llvm::ScalarEvolution::isLoopInvariant(), llvm::LoopVectorizationLegality::isMaskRequired(), llvm::CallBase::isNoBuiltin(), llvm::ElementCount::isScalar(), isUniformAfterVectorization(), llvm::ElementCount::isVector(), Legal, llvm::Intrinsic::not_intrinsic, llvm::OMP_Linear, llvm::OMP_Uniform, PSE, llvm::SmallVectorTemplateBase< T, bool >::push_back(), RetTy, setCallWideningDecision(), llvm::TargetTransformInfo::SK_Broadcast, llvm::TargetTransformInfo::TCK_RecipThroughput, TheLoop, TLI, llvm::toVectorTy(), and llvm::Vector.
Referenced by collectUniformsAndScalars().
|
inline |
Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF.
Broadcast this decision to all instructions inside the group. When interleaving, the cost is assigned to only one instruction, the insert position. For other cases, add the appropriate fraction of the total cost to each instruction. This ensures accurate costs are used, even if the insert position instruction is not used.
Definition at line 1169 of file LoopVectorize.cpp.
References assert(), CM_Interleave, llvm::InterleaveGroup< InstTy >::getFactor(), llvm::InterleaveGroup< InstTy >::getInsertPos(), llvm::InterleaveGroup< InstTy >::getMember(), llvm::InterleaveGroup< InstTy >::getNumMembers(), I, Idx, and llvm::ElementCount::isVector().
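The cost distribution described above can be sketched as follows: with CM_Interleave the whole cost lands on the insert position and the other members get zero; for any other decision each member receives an equal share. `ToyGroup` and `distributeGroupCost` are hypothetical stand-ins for the interleave group and the decision map, and costs are plain integers, so the division truncates.

```cpp
#include <cassert>
#include <map>
#include <vector>

enum InstWidening { CM_Widen, CM_Interleave, CM_GatherScatter, CM_Scalarize };

// Hypothetical stand-in for an interleave group: one slot per interleave
// index, with -1 marking a gap (no member at that index).
struct ToyGroup {
  std::vector<int> Slots;
  int InsertPos; // id of the member at the insert position
};

// Returns the cost recorded for each member of the group.
std::map<int, unsigned> distributeGroupCost(const ToyGroup &G, InstWidening W,
                                            unsigned Cost) {
  unsigned NumMembers = 0;
  for (int Id : G.Slots)
    if (Id >= 0)
      ++NumMembers;
  assert(NumMembers > 0 && "group must have at least one member");

  // Interleaving: full cost on the insert position, nothing elsewhere.
  unsigned InsertPosCost = Cost;
  unsigned OtherMemberCost = 0;
  // Any other decision: split the cost evenly across the members.
  if (W != CM_Interleave)
    InsertPosCost = OtherMemberCost = Cost / NumMembers;

  std::map<int, unsigned> PerMember;
  for (int Id : G.Slots) {
    if (Id < 0)
      continue; // skip gaps
    PerMember[Id] = (Id == G.InsertPos) ? InsertPosCost : OtherMemberCost;
  }
  return PerMember;
}
```

Splitting the cost in the non-interleaved case means that summing the per-member costs still approximates the group total even when only some members are visited.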
|
inline |
Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF.
Definition at line 1161 of file LoopVectorize.cpp.
References assert(), I, and llvm::ElementCount::isVector().
Referenced by setCostBasedWideningDecision().
Returns true if Op should be considered invariant and if it is trivially hoistable.
Definition at line 6547 of file LoopVectorize.cpp.
References llvm::all_of(), llvm::LoopBase< BlockT, LoopT >::contains(), llvm::LoopBase< BlockT, LoopT >::getHeader(), llvm::LoopVectorizationLegality::isInvariant(), isPredicatedInst(), Legal, shouldConsiderInvariant(), and TheLoop.
Referenced by getInstructionCost(), and shouldConsiderInvariant().
|
inline |
Returns true if we should use strict in-order reductions for the given RdxDesc.
This is true if the -enable-strict-reductions flag is passed, the IsOrdered flag of RdxDesc is set and we do not allow reordering of FP operations.
Definition at line 1079 of file LoopVectorize.cpp.
References llvm::LoopVectorizeHints::allowReordering(), Hints, and llvm::RecurrenceDescriptor::isOrdered().
Referenced by collectElementTypesForWidening(), collectInLoopReductions(), getReductionPatternCost(), and llvm::VPRecipeBuilder::tryToCreateWidenRecipe().
|
inline |
Returns true if the predicated reduction select should be used to set the incoming value for the reduction phi.
Definition at line 1519 of file LoopVectorize.cpp.
References foldTailWithEVL(), PreferPredicatedReductionSelect, and llvm::TargetTransformInfo::preferPredicatedReductionSelect().
|
friend |
Definition at line 989 of file LoopVectorize.cpp.
AssumptionCache* llvm::LoopVectorizationCostModel::AC |
Assumption cache.
Definition at line 1789 of file LoopVectorize.cpp.
Referenced by collectValuesToIgnore().
DemandedBits* llvm::LoopVectorizationCostModel::DB |
Demanded bits analysis.
Definition at line 1786 of file LoopVectorize.cpp.
SmallPtrSet<Type *, 16> llvm::LoopVectorizationCostModel::ElementTypesInLoop |
All element types found in the loop.
Definition at line 1810 of file LoopVectorize.cpp.
Referenced by collectElementTypesForWidening(), and getSmallestAndWidestTypes().
const LoopVectorizeHints* llvm::LoopVectorizationCostModel::Hints |
Loop Vectorize Hint.
Definition at line 1797 of file LoopVectorize.cpp.
Referenced by useOrderedReductions().
InterleavedAccessInfo& llvm::LoopVectorizationCostModel::InterleaveInfo |
The interleave access information contains groups of interleaved accesses with the same stride and close to each other.
Definition at line 1801 of file LoopVectorize.cpp.
Referenced by computeMaxVF(), getInterleavedAccessGroup(), isAccessInterleaved(), llvm::LoopVectorizationPlanner::plan(), and requiresScalarEpilogue().
LoopVectorizationLegality* llvm::LoopVectorizationCostModel::Legal |
Vectorization legality.
Definition at line 1777 of file LoopVectorize.cpp.
Referenced by blockNeedsPredicationForAnyReason(), canVectorizeReductions(), collectElementTypesForWidening(), collectInLoopReductions(), collectValuesToIgnore(), computeMaxVF(), expectedCost(), getDivRemSpeculationCost(), getInstructionCost(), getReductionPatternCost(), getSmallestAndWidestTypes(), interleavedAccessCanBeWidened(), isLegalMaskedLoad(), isLegalMaskedStore(), isOptimizableIVTruncate(), isPredicatedInst(), memoryInstructionCanBeWidened(), requiresScalarEpilogue(), runtimeChecksRequired(), selectInterleaveCount(), setCostBasedWideningDecision(), setTailFoldingStyles(), setVectorizedCallDecision(), and shouldConsiderInvariant().
LoopInfo* llvm::LoopVectorizationCostModel::LI |
Loop Info analysis.
Definition at line 1774 of file LoopVectorize.cpp.
Referenced by calculateRegisterUsage(), collectValuesToIgnore(), and isLegalGatherOrScatter().
OptimizationRemarkEmitter* llvm::LoopVectorizationCostModel::ORE |
Interface to emit optimization remarks.
Definition at line 1792 of file LoopVectorize.cpp.
Referenced by computeMaxVF(), and runtimeChecksRequired().
PredicatedScalarEvolution& llvm::LoopVectorizationCostModel::PSE |
Predicated scalar evolution analysis.
Definition at line 1771 of file LoopVectorize.cpp.
Referenced by computeMaxVF(), expectedCost(), getInstructionCost(), runtimeChecksRequired(), selectInterleaveCount(), and setVectorizedCallDecision().
Definition at line 1794 of file LoopVectorize.cpp.
Referenced by computeMaxVF(), and getSmallestAndWidestTypes().
Loop* llvm::LoopVectorizationCostModel::TheLoop |
The loop that we evaluate.
Definition at line 1768 of file LoopVectorize.cpp.
Referenced by calculateRegisterUsage(), collectElementTypesForWidening(), collectInLoopReductions(), collectInstsToScalarize(), collectValuesToIgnore(), computeMaxVF(), expectedCost(), getInstructionCost(), getReductionPatternCost(), getWideningDecision(), isEpilogueVectorizationProfitable(), isPredicatedInst(), isProfitableToScalarize(), isScalarAfterVectorization(), isUniformAfterVectorization(), requiresScalarEpilogue(), runtimeChecksRequired(), selectInterleaveCount(), setCostBasedWideningDecision(), setVectorizedCallDecision(), and shouldConsiderInvariant().
const TargetLibraryInfo* llvm::LoopVectorizationCostModel::TLI |
Target Library Info.
Definition at line 1783 of file LoopVectorize.cpp.
Referenced by collectValuesToIgnore(), llvm::LoopVectorizationPlanner::computeBestVF(), llvm::LoopVectorizationPlanner::emitInvalidCostRemarks(), getInstructionCost(), getVectorCallCost(), getVectorIntrinsicCost(), and setVectorizedCallDecision().
const TargetTransformInfo& llvm::LoopVectorizationCostModel::TTI |
Vector target information.
Definition at line 1780 of file LoopVectorize.cpp.
Referenced by calculateRegisterUsage(), llvm::LoopVectorizationPlanner::computeBestVF(), and llvm::LoopVectorizationPlanner::emitInvalidCostRemarks().
SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::ValuesToIgnore |
Values to ignore in the cost model.
Definition at line 1804 of file LoopVectorize.cpp.
Referenced by calculateRegisterUsage(), collectElementTypesForWidening(), collectValuesToIgnore(), expectedCost(), and llvm::VPCostContext::skipCostComputation().
SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::VecValuesToIgnore |
Values to ignore in the cost model when VF > 1.
Definition at line 1807 of file LoopVectorize.cpp.
Referenced by calculateRegisterUsage(), collectValuesToIgnore(), expectedCost(), and llvm::VPCostContext::skipCostComputation().