LLVM 22.0.0git
llvm::LoopVectorizationCostModel Class Reference

LoopVectorizationCostModel - estimates the expected speedups due to vectorization.

Classes

struct  CallWideningDecision

Public Types

enum  InstWidening {
  CM_Unknown , CM_Widen , CM_Widen_Reverse , CM_Interleave ,
  CM_GatherScatter , CM_Scalarize , CM_VectorCall , CM_IntrinsicCall
}
 Decision that was taken during cost calculation for memory instruction.

Public Member Functions

 LoopVectorizationCostModel (ScalarEpilogueLowering SEL, Loop *L, PredicatedScalarEvolution &PSE, LoopInfo *LI, LoopVectorizationLegality *Legal, const TargetTransformInfo &TTI, const TargetLibraryInfo *TLI, DemandedBits *DB, AssumptionCache *AC, OptimizationRemarkEmitter *ORE, const Function *F, const LoopVectorizeHints *Hints, InterleavedAccessInfo &IAI, ProfileSummaryInfo *PSI, BlockFrequencyInfo *BFI)
FixedScalableVFPair computeMaxVF (ElementCount UserVF, unsigned UserIC)
bool runtimeChecksRequired ()
bool selectUserVectorizationFactor (ElementCount UserVF)
 Setup cost-based decisions for user vectorization factor.
bool useMaxBandwidth (TargetTransformInfo::RegisterKind RegKind)
bool shouldConsiderRegPressureForVF (ElementCount VF)
std::pair< unsigned, unsigned > getSmallestAndWidestTypes ()
void setCostBasedWideningDecision (ElementCount VF)
 Memory access instruction may be vectorized in more than one way.
void setVectorizedCallDecision (ElementCount VF)
 A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking.
void collectValuesToIgnore ()
 Collect values we want to ignore in the cost model.
void collectElementTypesForWidening ()
 Collect all element types in the loop for which widening is needed.
void collectInLoopReductions ()
 Split reductions into those that happen in the loop, and those that happen outside.
bool useOrderedReductions (const RecurrenceDescriptor &RdxDesc) const
 Returns true if we should use strict in-order reductions for the given RdxDesc.
const MapVector< Instruction *, uint64_t > & getMinimalBitwidths () const
bool isProfitableToScalarize (Instruction *I, ElementCount VF) const
bool isUniformAfterVectorization (Instruction *I, ElementCount VF) const
 Returns true if I is known to be uniform after vectorization.
bool isScalarAfterVectorization (Instruction *I, ElementCount VF) const
 Returns true if I is known to be scalar after vectorization.
bool canTruncateToMinimalBitwidth (Instruction *I, ElementCount VF) const
void setWideningDecision (Instruction *I, ElementCount VF, InstWidening W, InstructionCost Cost)
 Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF.
void setWideningDecision (const InterleaveGroup< Instruction > *Grp, ElementCount VF, InstWidening W, InstructionCost Cost)
 Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF.
InstWidening getWideningDecision (Instruction *I, ElementCount VF) const
 Return the cost model decision for the given instruction I and vector width VF.
InstructionCost getWideningCost (Instruction *I, ElementCount VF)
 Return the vectorization cost for the given instruction I and vector width VF.
void setCallWideningDecision (CallInst *CI, ElementCount VF, InstWidening Kind, Function *Variant, Intrinsic::ID IID, std::optional< unsigned > MaskPos, InstructionCost Cost)
CallWideningDecision getCallWideningDecision (CallInst *CI, ElementCount VF) const
bool isOptimizableIVTruncate (Instruction *I, ElementCount VF)
 Return True if instruction I is an optimizable truncate whose operand is an induction variable.
void collectInstsToScalarize (ElementCount VF)
 Collects the instructions to scalarize for each predicated instruction in the loop.
void collectNonVectorizedAndSetWideningDecisions (ElementCount VF)
 Collect values that will not be widened, including Uniforms, Scalars, and Instructions to Scalarize for the given VF.
bool isLegalMaskedStore (Type *DataType, Value *Ptr, Align Alignment, unsigned AddressSpace) const
 Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr.
bool isLegalMaskedLoad (Type *DataType, Value *Ptr, Align Alignment, unsigned AddressSpace) const
 Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr.
bool isLegalGatherOrScatter (Value *V, ElementCount VF)
 Returns true if the target machine can represent V as a masked gather or scatter operation.
bool canVectorizeReductions (ElementCount VF) const
 Returns true if the target machine supports all of the reduction variables found for the given VF.
bool isDivRemScalarWithPredication (InstructionCost ScalarCost, InstructionCost SafeDivisorCost) const
 Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem.
bool isScalarWithPredication (Instruction *I, ElementCount VF) const
 Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available).
bool isPredicatedInst (Instruction *I) const
 Returns true if I is an instruction that needs to be predicated at runtime.
std::pair< InstructionCost, InstructionCost > getDivRemSpeculationCost (Instruction *I, ElementCount VF) const
 Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane.
bool memoryInstructionCanBeWidened (Instruction *I, ElementCount VF)
 Returns true if I is a memory instruction with consecutive memory access that can be widened.
bool interleavedAccessCanBeWidened (Instruction *I, ElementCount VF) const
 Returns true if I is a memory instruction in an interleaved-group of memory accesses that can be vectorized with wide vector loads/stores and shuffles.
bool isAccessInterleaved (Instruction *Instr) const
 Check if Instr belongs to any interleaved access group.
const InterleaveGroup< Instruction > * getInterleavedAccessGroup (Instruction *Instr) const
 Get the interleaved access group that Instr belongs to.
bool requiresScalarEpilogue (bool IsVectorizing) const
 Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.
bool isScalarEpilogueAllowed () const
 Returns true if a scalar epilogue is allowed, i.e. it has not been disallowed due to optsize or a loop hint annotation.
TailFoldingStyle getTailFoldingStyle (bool IVUpdateMayOverflow=true) const
 Returns the TailFoldingStyle that is best for the current loop.
void setTailFoldingStyles (bool IsScalableVF, unsigned UserIC)
 Selects and saves a TailFoldingStyle for each of two cases: the IV update may overflow, or it may not.
bool foldTailByMasking () const
 Returns true if all loop blocks should be masked to fold tail loop.
std::optional< unsigned > getMaxSafeElements () const
 Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies.
bool blockNeedsPredicationForAnyReason (BasicBlock *BB) const
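 Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.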
bool foldTailWithEVL () const
 Returns true if VP intrinsics with explicit vector length support should be generated in the tail folded loop.
bool isInLoopReduction (PHINode *Phi) const
 Returns true if the Phi is part of an inloop reduction.
bool usePredicatedReductionSelect () const
 Returns true if the predicated reduction select should be used to set the incoming value for the reduction phi.
InstructionCost getVectorIntrinsicCost (CallInst *CI, ElementCount VF) const
 Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF.
InstructionCost getVectorCallCost (CallInst *CI, ElementCount VF) const
 Estimate cost of a call instruction CI if it were vectorized with factor VF.
void invalidateCostModelingDecisions ()
 Invalidates decisions already taken by the cost model.
InstructionCost expectedCost (ElementCount VF)
 Returns the expected execution cost.
bool hasPredStores () const
bool isEpilogueVectorizationProfitable (const ElementCount VF, const unsigned IC) const
 Returns true if epilogue vectorization is considered profitable, and false otherwise.
InstructionCost getInstructionCost (Instruction *I, ElementCount VF)
 Returns the execution time cost of an instruction for a given vector width.
std::optional< InstructionCost > getReductionPatternCost (Instruction *I, ElementCount VF, Type *VectorTy) const
 Return the cost of instructions in an inloop reduction pattern, if I is part of that pattern.
bool shouldConsiderInvariant (Value *Op)
 Returns true if Op should be considered invariant and if it is trivially hoistable.
std::optional< unsigned > getVScaleForTuning () const
 Return the value of vscale used for tuning the cost model.

Public Attributes

Loop * TheLoop
 The loop that we evaluate.
PredicatedScalarEvolution & PSE
 Predicated scalar evolution analysis.
LoopInfo * LI
 Loop Info analysis.
LoopVectorizationLegality * Legal
 Vectorization legality.
const TargetTransformInfo & TTI
 Vector target information.
const TargetLibraryInfo * TLI
 Target Library Info.
DemandedBits * DB
 Demanded bits analysis.
AssumptionCache * AC
 Assumption cache.
OptimizationRemarkEmitter * ORE
 Interface to emit optimization remarks.
const Function * TheFunction
const LoopVectorizeHints * Hints
 Loop Vectorize Hint.
InterleavedAccessInfo & InterleaveInfo
 The interleave access information contains groups of interleaved accesses with the same stride and close to each other.
SmallPtrSet< const Value *, 16 > ValuesToIgnore
 Values to ignore in the cost model.
SmallPtrSet< const Value *, 16 > VecValuesToIgnore
 Values to ignore in the cost model when VF > 1.
SmallPtrSet< Type *, 16 > ElementTypesInLoop
 All element types found in the loop.
TTI::TargetCostKind CostKind
 The kind of cost that we are calculating.
bool OptForSize
 Whether this loop should be optimized for size based on function attribute or profile information.
FixedScalableVFPair MaxPermissibleVFWithoutMaxBW
 The highest VF possible for this loop, without using MaxBandwidth.

Friends

class LoopVectorizationPlanner

Detailed Description

LoopVectorizationCostModel - estimates the expected speedups due to vectorization.

In many cases vectorization is not profitable, and this can happen for a number of reasons. In this class we mainly attempt to predict the expected speedups or slowdowns due to the supported instruction set. We use TargetTransformInfo to query the different backends for the cost of different operations.

Definition at line 884 of file LoopVectorize.cpp.

Member Enumeration Documentation

◆ InstWidening

Decision that was taken during cost calculation for memory instruction.

Enumerator
CM_Unknown 
CM_Widen 
CM_Widen_Reverse 
CM_Interleave 
CM_GatherScatter 
CM_Scalarize 
CM_VectorCall 
CM_IntrinsicCall 

Definition at line 1037 of file LoopVectorize.cpp.

Constructor & Destructor Documentation

◆ LoopVectorizationCostModel()

Member Function Documentation

◆ blockNeedsPredicationForAnyReason()

bool llvm::LoopVectorizationCostModel::blockNeedsPredicationForAnyReason ( BasicBlock * BB) const
inline

Returns true if the instructions in this block require predication for any reason, e.g. because tail folding now requires a predicate or because the block in the original loop was predicated.

Definition at line 1378 of file LoopVectorize.cpp.

References foldTailByMasking(), and Legal.

Referenced by collectInstsToScalarize(), and interleavedAccessCanBeWidened().

◆ canTruncateToMinimalBitwidth()

bool llvm::LoopVectorizationCostModel::canTruncateToMinimalBitwidth ( Instruction * I,
ElementCount VF ) const
inline
Returns
True if instruction I can be truncated to a smaller bitwidth for vectorization factor VF.

Definition at line 1030 of file LoopVectorize.cpp.

References I, isProfitableToScalarize(), isScalarAfterVectorization(), and llvm::ElementCount::isVector().

Referenced by getInstructionCost().

◆ canVectorizeReductions()

bool llvm::LoopVectorizationCostModel::canVectorizeReductions ( ElementCount VF) const
inline

Returns true if the target machine supports all of the reduction variables found for the given VF.

Definition at line 1214 of file LoopVectorize.cpp.

References llvm::all_of(), and Legal.

◆ collectElementTypesForWidening()

void LoopVectorizationCostModel::collectElementTypesForWidening ( )

◆ collectInLoopReductions()

void LoopVectorizationCostModel::collectInLoopReductions ( )

Split reductions into those that happen in the loop, and those that happen outside.

In-loop reductions are collected into InLoopReductions.

Definition at line 6493 of file LoopVectorize.cpp.

References llvm::dbgs(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), llvm::RecurrenceDescriptor::getRecurrenceKind(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::RecurrenceDescriptor::getReductionOpChain(), I, Legal, LLVM_DEBUG, PreferInLoopReductions, TheLoop, TTI, and useOrderedReductions().

◆ collectInstsToScalarize()

◆ collectNonVectorizedAndSetWideningDecisions()

void llvm::LoopVectorizationCostModel::collectNonVectorizedAndSetWideningDecisions ( ElementCount VF)
inline

Collect values that will not be widened, including Uniforms, Scalars, and Instructions to Scalarize for the given VF.

The sets depend on CM decision for Load/Store instructions that may be vectorized as interleave, gather-scatter or scalarized. Also make a decision on what to do about call instructions in the loop at that VF – scalarize, call a known vector routine, or call a vector intrinsic.

Definition at line 1170 of file LoopVectorize.cpp.

References collectInstsToScalarize(), llvm::ElementCount::isScalar(), setCostBasedWideningDecision(), and setVectorizedCallDecision().

Referenced by selectUserVectorizationFactor().

◆ collectValuesToIgnore()

◆ computeMaxVF()

FixedScalableVFPair LoopVectorizationCostModel::computeMaxVF ( ElementCount UserVF,
unsigned UserIC )
Returns
An upper bound for the vectorization factors (both fixed and scalable). If the factors are 0, vectorization and interleaving should be avoided up front.

Definition at line 3487 of file LoopVectorize.cpp.

References llvm::ScalarEvolution::applyLoopGuards(), assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotAllowedLowTripLoop, llvm::CM_ScalarEpilogueNotAllowedOptSize, llvm::CM_ScalarEpilogueNotAllowedUsePredicate, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::DataWithEVL, llvm::dbgs(), llvm::FixedScalableVFPair::FixedVF, foldTailByMasking(), llvm::ScalarEvolution::getAddExpr(), llvm::ScalarEvolution::getBackedgeTakenCount(), llvm::ScalarEvolution::getConstant(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), getMaxVScale(), llvm::ScalarEvolution::getMinusOne(), llvm::FixedScalableVFPair::getNone(), llvm::ScalarEvolution::getOne(), llvm::ElementCount::getScalable(), llvm::Type::getScalarSizeInBits(), getSmallBestKnownTC(), getSmallConstantTripCount(), getTailFoldingStyle(), llvm::SCEV::getType(), llvm::ScalarEvolution::getURemExpr(), llvm::CmpInst::ICMP_EQ, InterleaveInfo, llvm::isa(), llvm::ScalarEvolution::isKnownPredicate(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isNonZero(), llvm::isPowerOf2_32(), llvm::ElementCount::isScalar(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isZero(), llvm::SCEV::isZero(), Legal, LLVM_DEBUG, ORE, PSE, llvm::reportVectorizationFailure(), runtimeChecksRequired(), llvm::FixedScalableVFPair::ScalableVF, setTailFoldingStyles(), TheFunction, TheLoop, TTI, and useMaskedInterleavedAccesses().

◆ expectedCost()

InstructionCost LoopVectorizationCostModel::expectedCost ( ElementCount VF)

Returns the expected execution cost.

The unit of the cost does not matter because we use the 'cost' units to compare different vector widths. The cost that is returned is not normalized by the factor width.
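Because the returned cost is not normalized by the width, a caller that wants to compare two widths has to account for the number of lanes itself. The following is a minimal sketch of one way to do that. `Candidate`, `isCheaperPerLane` and `pickBest` are hypothetical names, and plain integers stand in for `ElementCount` and `InstructionCost`; this is not the selection logic used in LoopVectorize.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a (VF, expectedCost(VF)) pair; plain integers
// replace ElementCount and InstructionCost for illustration only.
struct Candidate {
  unsigned Width;     // number of lanes; 1 means scalar
  std::uint64_t Cost; // un-normalized cost of one iteration at this width
};

// Returns true if A is cheaper per lane than B, i.e.
// A.Cost / A.Width < B.Cost / B.Width, using cross-multiplication so the
// comparison stays in integer arithmetic.
inline bool isCheaperPerLane(const Candidate &A, const Candidate &B) {
  assert(A.Width > 0 && B.Width > 0 && "width must be non-zero");
  return A.Cost * B.Width < B.Cost * A.Width;
}

// Picks the candidate with the lowest per-lane cost; ties keep the earlier
// candidate.
inline Candidate pickBest(const std::vector<Candidate> &Cands) {
  assert(!Cands.empty() && "need at least one candidate");
  Candidate Best = Cands.front();
  for (const Candidate &C : Cands)
    if (isCheaperPerLane(C, Best))
      Best = C;
  return Best;
}
```

With candidate costs of 10 at width 1, 24 at width 4 and 64 at width 8, the per-lane costs are 10, 6 and 8, so this sketch would pick width 4.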

Definition at line 5033 of file LoopVectorize.cpp.

References addFullyUnrolledInstructionsToIgnore(), llvm::CallingConv::C, CostKind, llvm::SmallPtrSetImpl< PtrType >::count(), llvm::dbgs(), foldTailByMasking(), llvm::ForceTargetInstructionCost, getInstructionCost(), llvm::getPredBlockCostDivisor(), getSmallConstantTripCount(), I, InstructionCost, llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), Legal, LLVM_DEBUG, PSE, TheLoop, ValuesToIgnore, and VecValuesToIgnore.

Referenced by selectUserVectorizationFactor().

◆ foldTailByMasking()

bool llvm::LoopVectorizationCostModel::foldTailByMasking ( ) const
inline

Returns true if all loop blocks should be masked to fold tail loop.

Definition at line 1360 of file LoopVectorize.cpp.

References getTailFoldingStyle(), and llvm::None.

Referenced by blockNeedsPredicationForAnyReason(), computeMaxVF(), expectedCost(), isPredicatedInst(), and setCostBasedWideningDecision().

◆ foldTailWithEVL()

bool llvm::LoopVectorizationCostModel::foldTailWithEVL ( ) const
inline

Returns true if VP intrinsics with explicit vector length support should be generated in the tail folded loop.

Definition at line 1384 of file LoopVectorize.cpp.

References llvm::DataWithEVL, and getTailFoldingStyle().

Referenced by getInstructionCost(), and usePredicatedReductionSelect().
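The relationship between the tail-folding queries can be sketched as follows. This is an illustration inferred from the References lists above, not the actual implementation: the enum is a reduced, hypothetical version that models only `None`, `DataWithEVL` and one generic masked style, and the functions take the style as a parameter instead of reading class state.

```cpp
#include <cassert>

// Reduced, hypothetical tail-folding enum: only the two values named in the
// references (None, DataWithEVL) plus one generic masked style.
enum class TailFoldingStyle { None, Data, DataWithEVL };

// Tail folding by masking is in effect for any style other than None.
inline bool foldTailByMasking(TailFoldingStyle Style) {
  return Style != TailFoldingStyle::None;
}

// Explicit-vector-length folding is one particular tail-folding style.
inline bool foldTailWithEVL(TailFoldingStyle Style) {
  return Style == TailFoldingStyle::DataWithEVL;
}

// A block needs predication either because the tail is folded or because
// the block was already predicated in the original loop.
inline bool blockNeedsPredication(TailFoldingStyle Style, bool WasPredicated) {
  assert((!foldTailWithEVL(Style) || foldTailByMasking(Style)) &&
         "EVL folding implies tail folding");
  return foldTailByMasking(Style) || WasPredicated;
}
```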

◆ getCallWideningDecision()

CallWideningDecision llvm::LoopVectorizationCostModel::getCallWideningDecision ( CallInst * CI,
ElementCount VF ) const
inline

◆ getDivRemSpeculationCost()

std::pair< InstructionCost, InstructionCost > LoopVectorizationCostModel::getDivRemSpeculationCost ( Instruction * I,
ElementCount VF ) const

Return the costs for our two available strategies for lowering a div/rem operation which requires speculating at least one lane.

First result is for scalarization (will be invalid for scalable vectors); second is for the safe-divisor strategy.

Definition at line 2863 of file LoopVectorize.cpp.

References assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, CostKind, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::getPredBlockCostDivisor(), I, llvm::isSafeToSpeculativelyExecute(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), Operands, llvm::toVectorTy(), and TTI.

Referenced by getInstructionCost(), and isScalarWithPredication().

◆ getInstructionCost()

InstructionCost LoopVectorizationCostModel::getInstructionCost ( Instruction * I,
ElementCount VF )

Returns the execution time cost of an instruction for a given vector width.

Vector width of one means scalar.

Definition at line 5904 of file LoopVectorize.cpp.

References llvm::all_of(), assert(), llvm::CmpInst::BAD_ICMP_PREDICATE, canTruncateToMinimalBitwidth(), llvm::cast(), llvm::cast_if_present(), CM_GatherScatter, CM_Interleave, CM_IntrinsicCall, CM_Scalarize, CM_Unknown, CM_VectorCall, CM_Widen, CM_Widen_Reverse, CostKind, llvm::dyn_cast(), llvm::find_singleton(), foldTailWithEVL(), llvm::TargetTransformInfo::GatherScatter, llvm::IntegerType::get(), llvm::VectorType::get(), llvm::APInt::getAllOnes(), llvm::Type::getContext(), getDivRemSpeculationCost(), llvm::ElementCount::getFixed(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getFixedValue(), getInstructionCost(), llvm::Type::getInt1Ty(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStoreType(), llvm::TargetTransformInfo::getOperandInfo(), llvm::ilist_detail::node_parent_access< NodeTy, ParentTy >::getParent(), llvm::LoadInst::getPointerOperandType(), getReductionPatternCost(), llvm::Type::getScalarSizeInBits(), llvm::ScalarEvolution::getSCEV(), llvm::BranchInst::getSuccessor(), llvm::Value::getType(), getVectorCallCost(), llvm::Type::getVoidTy(), getWideningCost(), getWideningDecision(), I, llvm::CmpInst::ICMP_EQ, llvm::TargetTransformInfo::Interleave, llvm::isa(), llvm::RecurrenceDescriptor::isAnyOfRecurrenceKind(), llvm::BranchInst::isConditional(), isDivRemScalarWithPredication(), isInLoopReduction(), llvm::ScalarEvolution::isLoopInvariant(), isOptimizableIVTruncate(), isPredicatedInst(), isProfitableToScalarize(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarAfterVectorization(), isUniformAfterVectorization(), llvm::ElementCount::isVector(), llvm::Type::isVectorTy(), Legal, llvm_unreachable, llvm::llvm_unreachable_internal(), llvm::HistogramInfo::Load, llvm::PatternMatch::m_LogicalAnd(), llvm::PatternMatch::m_LogicalOr(), llvm::PatternMatch::m_Value(), llvm::CmpInst::makeCmpResultType(), 
llvm::TargetTransformInfo::Masked, llvm::PatternMatch::match(), llvm::TargetTransformInfo::None, llvm::TargetTransformInfo::Normal, llvm::TargetTransformInfo::OK_AnyValue, llvm::TargetTransformInfo::OK_UniformValue, Operands, PSE, llvm::TargetTransformInfo::Reversed, shouldConsiderInvariant(), llvm::TargetTransformInfo::SK_Splice, llvm::TargetTransformInfo::TCC_Free, TheLoop, TLI, llvm::toVectorizedTy(), llvm::toVectorTy(), and TTI.

Referenced by expectedCost(), and getInstructionCost().

◆ getInterleavedAccessGroup()

const InterleaveGroup< Instruction > * llvm::LoopVectorizationCostModel::getInterleavedAccessGroup ( Instruction * Instr) const
inline

Get the interleaved access group that Instr belongs to.

Definition at line 1272 of file LoopVectorize.cpp.

References InterleaveInfo.

Referenced by collectValuesToIgnore(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().

◆ getMaxSafeElements()

std::optional< unsigned > llvm::LoopVectorizationCostModel::getMaxSafeElements ( ) const
inline

Return maximum safe number of elements to be processed per vector iteration, which do not prevent store-load forwarding and are safe with regard to the memory dependencies.

Required for EVL-based VPlans to correctly calculate AVL (application vector length) as min(remaining AVL, MaxSafeElements). TODO: need to consider adjusting cost model to use this value as a vectorization factor for EVL-based vectorization.
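The min(remaining AVL, MaxSafeElements) clamp can be illustrated with a small standalone sketch. `computeStepLengths` and its `MaxLanes` parameter (a stand-in for the hardware vector length) are hypothetical and only show how such a bound would shape the number of elements processed per vector iteration; they are not part of the documented API.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Sketch: splits TripCount elements into per-iteration vector lengths.
// MaxLanes is a hypothetical hardware limit; MaxSafeElements, when present,
// additionally bounds the application vector length (AVL).
inline std::vector<unsigned>
computeStepLengths(unsigned TripCount, unsigned MaxLanes,
                   std::optional<unsigned> MaxSafeElements) {
  assert(MaxLanes > 0 && "need at least one lane");
  assert((!MaxSafeElements || *MaxSafeElements > 0) &&
         "a safe-element bound must be non-zero");
  std::vector<unsigned> Steps;
  unsigned Remaining = TripCount;
  while (Remaining > 0) {
    unsigned AVL = Remaining;
    if (MaxSafeElements)
      AVL = std::min(AVL, *MaxSafeElements); // min(remaining AVL, MaxSafeElements)
    unsigned EVL = std::min(AVL, MaxLanes);  // lanes actually processed this step
    Steps.push_back(EVL);
    Remaining -= EVL;
  }
  return Steps;
}
```

For a trip count of 10 with 8 lanes and a bound of 3 safe elements, the sketch yields steps of 3, 3, 3 and 1; without a bound and with 4 lanes it yields 4, 4 and 2.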

Definition at line 1373 of file LoopVectorize.cpp.

◆ getMinimalBitwidths()

const MapVector< Instruction *, uint64_t > & llvm::LoopVectorizationCostModel::getMinimalBitwidths ( ) const
inline
Returns
The smallest bitwidth each instruction can be represented with. The vector equivalents of these instructions should be truncated to this type.

Definition at line 975 of file LoopVectorize.cpp.

◆ getReductionPatternCost()

◆ getSmallestAndWidestTypes()

std::pair< unsigned, unsigned > LoopVectorizationCostModel::getSmallestAndWidestTypes ( )
Returns
The size (in bits) of the smallest and widest types in the code that needs to be vectorized. We ignore values that remain scalar such as 64 bit loop indices.
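As a rough illustration of the returned pair, the sketch below computes the smallest and widest widths from a list of element bit widths. `smallestAndWidest` is a hypothetical helper. It assumes the caller has already excluded types that remain scalar (such as 64-bit loop indices) and requires a non-empty list, which the real function may handle differently.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Sketch: given the bit widths of the element types that will be widened,
// returns {smallest, widest}. Types that stay scalar are assumed to have
// been filtered out by the caller.
inline std::pair<unsigned, unsigned>
smallestAndWidest(const std::vector<unsigned> &ElementBits) {
  assert(!ElementBits.empty() && "sketch requires at least one element type");
  auto [MinIt, MaxIt] =
      std::minmax_element(ElementBits.begin(), ElementBits.end());
  return {*MinIt, *MaxIt};
}
```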

Definition at line 4436 of file LoopVectorize.cpp.

References DL, ElementTypesInLoop, llvm::RecurrenceDescriptor::getMinWidthCastToRecurrenceTypeInBits(), llvm::RecurrenceDescriptor::getRecurrenceType(), llvm::Type::getScalarSizeInBits(), Legal, T, and TheFunction.

Referenced by determineVPlanVF().

◆ getTailFoldingStyle()

TailFoldingStyle llvm::LoopVectorizationCostModel::getTailFoldingStyle ( bool IVUpdateMayOverflow = true) const
inline

Returns the TailFoldingStyle that is best for the current loop.

Definition at line 1307 of file LoopVectorize.cpp.

References llvm::None.

Referenced by computeMaxVF(), foldTailByMasking(), and foldTailWithEVL().

◆ getVectorCallCost()

InstructionCost LoopVectorizationCostModel::getVectorCallCost ( CallInst * CI,
ElementCount VF ) const

◆ getVectorIntrinsicCost()

InstructionCost LoopVectorizationCostModel::getVectorIntrinsicCost ( CallInst * CI,
ElementCount VF ) const

Estimate cost of an intrinsic call instruction CI if it were vectorized with factor VF.

Return the cost of the instruction, including scalarization overhead if it's needed.

Definition at line 2524 of file LoopVectorize.cpp.

References llvm::CallBase::args(), Arguments, assert(), CostKind, llvm::dyn_cast(), llvm::CallBase::getCalledFunction(), llvm::Function::getFunctionType(), llvm::InstructionCost::getInvalid(), llvm::Value::getType(), llvm::getVectorIntrinsicIDForCall(), maybeVectorizeType(), llvm::FunctionType::param_begin(), llvm::FunctionType::param_end(), TLI, and TTI.

Referenced by getVectorCallCost(), and setVectorizedCallDecision().

◆ getVScaleForTuning()

std::optional< unsigned > llvm::LoopVectorizationCostModel::getVScaleForTuning ( ) const
inline

Return the value of vscale used for tuning the cost model.

Definition at line 1453 of file LoopVectorize.cpp.

Referenced by llvm::LoopVectorizePass::processLoop().

◆ getWideningCost()

InstructionCost llvm::LoopVectorizationCostModel::getWideningCost ( Instruction * I,
ElementCount VF )
inline

Return the vectorization cost for the given instruction I and vector width VF.

Definition at line 1100 of file LoopVectorize.cpp.

References assert(), I, and llvm::ElementCount::isVector().

Referenced by getInstructionCost().

◆ getWideningDecision()

InstWidening llvm::LoopVectorizationCostModel::getWideningDecision ( Instruction * I,
ElementCount VF ) const
inline

Return the cost model decision for the given instruction I and vector width VF.

Return CM_Unknown if this instruction did not pass through the cost modeling.
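The lookup behaviour described here can be sketched with an ordinary map keyed on the instruction and the vectorization factor. `DecisionTable` is a hypothetical class: integer ids stand in for `Instruction *`, an `unsigned` stands in for `ElementCount`, and the container choice is an assumption rather than the one used in LoopVectorize.cpp.

```cpp
#include <cassert>
#include <map>
#include <utility>

enum InstWidening {
  CM_Unknown, CM_Widen, CM_Widen_Reverse, CM_Interleave,
  CM_GatherScatter, CM_Scalarize, CM_VectorCall, CM_IntrinsicCall
};

// Hypothetical decision table: (instruction id, VF) -> (decision, cost).
class DecisionTable {
  using Key = std::pair<int, unsigned>;
  std::map<Key, std::pair<InstWidening, unsigned>> Decisions;

public:
  // Records decision W and its cost for instruction InstId at width VF.
  void set(int InstId, unsigned VF, InstWidening W, unsigned Cost) {
    assert(VF > 1 && "expected a vector VF");
    Decisions[Key{InstId, VF}] = {W, Cost};
  }

  // Returns the recorded decision, or CM_Unknown if the instruction never
  // passed through cost modeling at this VF.
  InstWidening get(int InstId, unsigned VF) const {
    assert(VF > 1 && "expected a vector VF");
    auto It = Decisions.find(Key{InstId, VF});
    return It == Decisions.end() ? CM_Unknown : It->second.first;
  }
};
```

Note that a decision recorded for one VF says nothing about another VF: querying the same instruction at a different width still yields CM_Unknown in this sketch.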

Definition at line 1085 of file LoopVectorize.cpp.

References assert(), CM_Unknown, I, llvm::ElementCount::isVector(), and TheLoop.

Referenced by getInstructionCost(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().

◆ hasPredStores()

bool llvm::LoopVectorizationCostModel::hasPredStores ( ) const
inline

Definition at line 1428 of file LoopVectorize.cpp.

◆ interleavedAccessCanBeWidened()

bool LoopVectorizationCostModel::interleavedAccessCanBeWidened ( Instruction * I,
ElementCount VF ) const

◆ invalidateCostModelingDecisions()

void llvm::LoopVectorizationCostModel::invalidateCostModelingDecisions ( )
inline

Invalidates decisions already taken by the cost model.

Definition at line 1415 of file LoopVectorize.cpp.

◆ isAccessInterleaved()

bool llvm::LoopVectorizationCostModel::isAccessInterleaved ( Instruction * Instr) const
inline

Check if Instr belongs to any interleaved access group.

Definition at line 1266 of file LoopVectorize.cpp.

References InterleaveInfo.

Referenced by collectValuesToIgnore(), interleavedAccessCanBeWidened(), and setCostBasedWideningDecision().

◆ isDivRemScalarWithPredication()

bool llvm::LoopVectorizationCostModel::isDivRemScalarWithPredication ( InstructionCost ScalarCost,
InstructionCost SafeDivisorCost ) const
inline

Given costs for both strategies, return true if the scalar predication lowering should be used for div/rem.

This incorporates an override option so it is not simply a cost comparison.
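A minimal sketch of how such an override can sit in front of the cost comparison is shown below. `ForceFlag` is a hypothetical tri-state modelled on the BOU_UNSET, BOU_TRUE and BOU_FALSE values referenced by this function, plain integers stand in for `InstructionCost`, and the tie-breaking on equal costs is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tri-state mirroring BOU_UNSET / BOU_TRUE / BOU_FALSE.
enum class ForceFlag { Unset, True, False };

// Sketch: returns true when scalarization with predication should be used
// for a div/rem. When the force flag is set it overrides the comparison.
inline bool useScalarWithPredication(ForceFlag ForceSafeDivisor,
                                     std::uint64_t ScalarCost,
                                     std::uint64_t SafeDivisorCost) {
  switch (ForceSafeDivisor) {
  case ForceFlag::Unset:
    return ScalarCost < SafeDivisorCost; // plain cost comparison
  case ForceFlag::True:
    return false; // safe-divisor strategy forced
  case ForceFlag::False:
    return true; // scalar predication forced
  }
  assert(false && "unexpected flag value");
  return false;
}
```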

Definition at line 1224 of file LoopVectorize.cpp.

References llvm::cl::BOU_FALSE, llvm::cl::BOU_TRUE, llvm::cl::BOU_UNSET, ForceSafeDivisor, and llvm_unreachable.

Referenced by getInstructionCost(), and isScalarWithPredication().

◆ isEpilogueVectorizationProfitable()

bool LoopVectorizationCostModel::isEpilogueVectorizationProfitable ( const ElementCount VF,
const unsigned IC ) const

Returns true if epilogue vectorization is considered profitable, and false otherwise.

VF is the vectorization factor chosen for the original loop. Multiplier is an additional scaling factor applied to VF before comparing to EpilogueVectorizationMinVF.

Definition at line 4295 of file LoopVectorize.cpp.

References EpilogueVectorizationMinVF, estimateElementCount(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isFixed(), and TTI.

◆ isInLoopReduction()

bool llvm::LoopVectorizationCostModel::isInLoopReduction ( PHINode * Phi) const
inline

Returns true if the Phi is part of an inloop reduction.

Definition at line 1389 of file LoopVectorize.cpp.

Referenced by getInstructionCost().

◆ isLegalGatherOrScatter()

bool llvm::LoopVectorizationCostModel::isLegalGatherOrScatter ( Value * V,
ElementCount VF )
inline

Returns true if the target machine can represent V as a masked gather or scatter operation.

Definition at line 1199 of file LoopVectorize.cpp.

References llvm::VectorType::get(), llvm::getLoadStoreAlignment(), llvm::getLoadStoreType(), llvm::isa(), llvm::ElementCount::isVector(), LI, and TTI.

Referenced by setCostBasedWideningDecision().

◆ isLegalMaskedLoad()

bool llvm::LoopVectorizationCostModel::isLegalMaskedLoad ( Type * DataType,
Value * Ptr,
Align Alignment,
unsigned AddressSpace ) const
inline

Returns true if the target machine supports masked load operation for the given DataType and kind of access to Ptr.

Definition at line 1191 of file LoopVectorize.cpp.

References Legal, Ptr, and TTI.

Referenced by isScalarWithPredication().

◆ isLegalMaskedStore()

bool llvm::LoopVectorizationCostModel::isLegalMaskedStore ( Type * DataType,
Value * Ptr,
Align Alignment,
unsigned AddressSpace ) const
inline

Returns true if the target machine supports masked store operation for the given DataType and kind of access to Ptr.

Definition at line 1183 of file LoopVectorize.cpp.

References Legal, Ptr, and TTI.

Referenced by isScalarWithPredication().

◆ isOptimizableIVTruncate()

bool llvm::LoopVectorizationCostModel::isOptimizableIVTruncate ( Instruction * I,
ElementCount VF )
inline

Return True if instruction I is an optimizable truncate whose operand is an induction variable.

Such a truncate will be removed by adding a new induction variable with the destination type.

Definition at line 1136 of file LoopVectorize.cpp.

References llvm::dyn_cast(), I, Legal, llvm::toVectorTy(), and TTI.

Referenced by getInstructionCost().

◆ isPredicatedInst()

bool LoopVectorizationCostModel::isPredicatedInst ( Instruction * I) const

Returns true if I is an instruction that needs to be predicated at runtime.

The result is independent of the predication mechanism. Superset of instructions that return true for isScalarWithPredication.

Definition at line 2810 of file LoopVectorize.cpp.

References assert(), llvm::cast(), foldTailByMasking(), llvm::getLoadStorePointerOperand(), I, llvm::isa(), llvm::isSafeToSpeculativelyExecute(), Legal, llvm_unreachable, and TheLoop.

Referenced by getInstructionCost(), isScalarWithPredication(), and shouldConsiderInvariant().

◆ isProfitableToScalarize()

bool llvm::LoopVectorizationCostModel::isProfitableToScalarize ( Instruction * I,
ElementCount VF ) const
inline
Returns
True if it is more profitable to scalarize instruction I for vectorization factor VF.

Definition at line 981 of file LoopVectorize.cpp.

References assert(), I, llvm::ElementCount::isVector(), and TheLoop.

Referenced by canTruncateToMinimalBitwidth(), and getInstructionCost().

◆ isScalarAfterVectorization()

bool llvm::LoopVectorizationCostModel::isScalarAfterVectorization ( Instruction * I,
ElementCount VF ) const
inline

Returns true if I is known to be scalar after vectorization.

Definition at line 1015 of file LoopVectorize.cpp.

References assert(), I, llvm::ElementCount::isScalar(), and TheLoop.

Referenced by canTruncateToMinimalBitwidth(), collectInstsToScalarize(), and getInstructionCost().

◆ isScalarEpilogueAllowed()

bool llvm::LoopVectorizationCostModel::isScalarEpilogueAllowed ( ) const
inline

Returns true if a scalar epilogue is allowed, i.e. it has not been disallowed due to optsize or a loop hint annotation.

Definition at line 1302 of file LoopVectorize.cpp.

References llvm::CM_ScalarEpilogueAllowed.

Referenced by interleavedAccessCanBeWidened(), and requiresScalarEpilogue().

◆ isScalarWithPredication()

bool LoopVectorizationCostModel::isScalarWithPredication ( Instruction * I,
ElementCount VF ) const

Returns true if I is an instruction which requires predication and for which our chosen predication strategy is scalarization (i.e. we don't have an alternate strategy such as masking available).

VF is the vectorization factor that will be used to vectorize I.
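As a rough illustration of how this differs from isPredicatedInst(), the toy model below treats an instruction as scalar-with-predication only when it needs predication and no masked vector form is available. It is a sketch, not the LLVM implementation: `ToyInst`, its fields, and `toyIsScalarWithPredication` are hypothetical names, and the real function also consults per-opcode rules and target hooks.

```cpp
#include <cassert>

// Hypothetical stand-in for an instruction; not an LLVM type.
struct ToyInst {
  bool NeedsPredication;  // assumed to mirror what isPredicatedInst() reports
  bool HasMaskedLowering; // assumed: a masked vector form exists on the target
};

// Scalarization is the chosen strategy only when predication is required
// and no alternate strategy (such as masking) is available.
bool toyIsScalarWithPredication(const ToyInst &I) {
  if (!I.NeedsPredication)
    return false;
  return !I.HasMaskedLowering;
}
```

Under this model every instruction that is scalar-with-predication is also predicated, which matches the superset relation stated for isPredicatedInst().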

Definition at line 2768 of file LoopVectorize.cpp.

References llvm::cast(), CM_Scalarize, llvm::VectorType::get(), getCallWideningDecision(), getDivRemSpeculationCost(), llvm::getLoadStoreAddressSpace(), llvm::getLoadStoreAlignment(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), I, llvm::isa(), isDivRemScalarWithPredication(), isLegalMaskedLoad(), isLegalMaskedStore(), isPredicatedInst(), llvm::ElementCount::isScalar(), llvm::ElementCount::isVector(), llvm::LoopVectorizationCostModel::CallWideningDecision::Kind, Ptr, and TTI.

Referenced by collectInstsToScalarize(), memoryInstructionCanBeWidened(), and setCostBasedWideningDecision().

◆ isUniformAfterVectorization()

bool llvm::LoopVectorizationCostModel::isUniformAfterVectorization ( Instruction * I,
ElementCount VF ) const
inline

Returns true if I is known to be uniform after vectorization.

Definition at line 995 of file LoopVectorize.cpp.

References assert(), I, llvm::isa(), llvm::ElementCount::isScalar(), and TheLoop.

Referenced by getInstructionCost(), and setVectorizedCallDecision().

◆ memoryInstructionCanBeWidened()

bool LoopVectorizationCostModel::memoryInstructionCanBeWidened ( Instruction * I,
ElementCount VF )

Returns true if I is a memory instruction with consecutive memory access that can be widened.

Definition at line 3000 of file LoopVectorize.cpp.

References assert(), DL, llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), hasIrregularType(), I, llvm::isa(), isScalarWithPredication(), Legal, and Ptr.

Referenced by setCostBasedWideningDecision().

◆ requiresScalarEpilogue()

bool llvm::LoopVectorizationCostModel::requiresScalarEpilogue ( bool IsVectorizing) const
inline

Returns true if we're required to use a scalar epilogue for at least the final iteration of the original loop.

Definition at line 1278 of file LoopVectorize.cpp.

References llvm::dbgs(), EnableEarlyExitVectorization, InterleaveInfo, isScalarEpilogueAllowed(), Legal, LLVM_DEBUG, and TheLoop.

Referenced by collectValuesToIgnore().

◆ runtimeChecksRequired()

bool LoopVectorizationCostModel::runtimeChecksRequired ( )
Returns
True if runtime checks are required for vectorization, and false otherwise.

Definition at line 3261 of file LoopVectorize.cpp.

References llvm::dbgs(), Legal, LLVM_DEBUG, ORE, PSE, llvm::reportVectorizationFailure(), and TheLoop.

Referenced by computeMaxVF().

◆ selectUserVectorizationFactor()

bool llvm::LoopVectorizationCostModel::selectUserVectorizationFactor ( ElementCount UserVF)
inline

Setup cost-based decisions for user vectorization factor.

Returns
true if the UserVF is a feasible VF to be chosen.

Definition at line 921 of file LoopVectorize.cpp.

References collectNonVectorizedAndSetWideningDecisions(), expectedCost(), and llvm::InstructionCost::isValid().

◆ setCallWideningDecision()

void llvm::LoopVectorizationCostModel::setCallWideningDecision ( CallInst * CI,
ElementCount VF,
InstWidening Kind,
Function * Variant,
Intrinsic::ID IID,
std::optional< unsigned > MaskPos,
InstructionCost Cost )
inline

Definition at line 1116 of file LoopVectorize.cpp.

References assert(), and llvm::ElementCount::isScalar().

Referenced by setVectorizedCallDecision().

◆ setCostBasedWideningDecision()

void LoopVectorizationCostModel::setCostBasedWideningDecision ( ElementCount VF)

Memory access instruction may be vectorized in more than one way.

Form of instruction after vectorization depends on cost. This function takes cost-based decisions for Load/Store instructions and collects them in a map. This decisions map is used for building the lists of loop-uniform and loop-scalar instructions. The calculated cost is saved with widening decision in order to avoid redundant calculations.
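The decisions map described above can be pictured as a table keyed by instruction and VF that stores the chosen form together with its cost, so that later queries do not recompute it. The following is a minimal sketch under that assumption; `ToyWidening`, `ToyDecisionMap`, and the helper functions are hypothetical and use plain integers in place of LLVM's Instruction, ElementCount, and InstructionCost types.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical, reduced set of widening kinds for illustration only.
enum ToyWidening { Toy_Unknown, Toy_Widen, Toy_GatherScatter, Toy_Scalarize };

// Key: (instruction id, vectorization factor). Value: (decision, cached cost).
using ToyKey = std::pair<int, unsigned>;
using ToyDecisionMap = std::map<ToyKey, std::pair<ToyWidening, unsigned>>;

// Record the decision and its cost once for a given instruction and VF.
void toySetDecision(ToyDecisionMap &M, int InstId, unsigned VF, ToyWidening W,
                    unsigned Cost) {
  M[{InstId, VF}] = {W, Cost};
}

// Look up a decision; an absent entry is reported as Toy_Unknown.
ToyWidening toyGetDecision(const ToyDecisionMap &M, int InstId, unsigned VF) {
  auto It = M.find({InstId, VF});
  return It == M.end() ? Toy_Unknown : It->second.first;
}

// Return the cached cost; the entry must already exist.
unsigned toyGetCost(const ToyDecisionMap &M, int InstId, unsigned VF) {
  return M.at({InstId, VF}).second;
}
```

Keying on the pair means the same instruction can carry different decisions at different vectorization factors.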

Definition at line 5541 of file LoopVectorize.cpp.

References llvm::append_range(), assert(), llvm::cast(), CM_GatherScatter, CM_Interleave, CM_Scalarize, CM_Unknown, CM_Widen, CM_Widen_Reverse, llvm::dyn_cast(), llvm::dyn_cast_or_null(), llvm::SmallVectorTemplateCommon< T, typename >::empty(), foldTailByMasking(), llvm::ElementCount::getFixed(), getInterleavedAccessGroup(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::getLoadStorePointerOperand(), llvm::getLoadStoreType(), getWideningDecision(), I, llvm::SmallPtrSetImpl< PtrType >::insert(), interleavedAccessCanBeWidened(), llvm::isa(), isAccessInterleaved(), isLegalGatherOrScatter(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isScalable(), llvm::ElementCount::isScalar(), isScalarWithPredication(), Legal, memoryInstructionCanBeWidened(), llvm::SmallVectorImpl< T >::pop_back_val(), Ptr, llvm::SmallVectorTemplateBase< T, bool >::push_back(), setWideningDecision(), TheLoop, and TTI.

Referenced by collectNonVectorizedAndSetWideningDecisions().

◆ setTailFoldingStyles()

void llvm::LoopVectorizationCostModel::setTailFoldingStyles ( bool IsScalableVF,
unsigned UserIC )
inline

Selects and saves the TailFoldingStyle for two cases: the IV update may overflow, or it may not.

Parameters
IsScalableVF	true if scalable vector factors enabled.
UserIC	User specific interleave count.
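One way to read "saves a style for two cases" is a pair of styles selected later by an overflow flag. The sketch below assumes exactly that and nothing more; `ToyStyle`, `ToyTailFolding`, and their members are hypothetical names, and the selection logic that picks the two styles from the parameters above is deliberately left out.

```cpp
#include <cassert>

// Hypothetical, reduced set of styles for illustration only.
enum class ToyStyle { None, Data, DataWithoutLaneMask };

struct ToyTailFolding {
  bool HasChoice = false;                 // nothing saved until set() is called
  ToyStyle IfOverflow = ToyStyle::None;   // style used when the IV update may overflow
  ToyStyle IfNoOverflow = ToyStyle::None; // style used when it cannot overflow

  // Save one style per case.
  void set(ToyStyle Overflow, ToyStyle NoOverflow) {
    HasChoice = true;
    IfOverflow = Overflow;
    IfNoOverflow = NoOverflow;
  }

  // Pick the saved style that matches the caller's overflow knowledge.
  ToyStyle get(bool IVUpdateMayOverflow) const {
    if (!HasChoice)
      return ToyStyle::None;
    return IVUpdateMayOverflow ? IfOverflow : IfNoOverflow;
  }
};
```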

Definition at line 1318 of file LoopVectorize.cpp.

References assert(), llvm::CM_ScalarEpilogueAllowed, llvm::CM_ScalarEpilogueNotNeededUsePredicate, llvm::DataWithEVL, llvm::DataWithoutLaneMask, llvm::dbgs(), llvm::EnableVPlanNativePath, ForceTailFoldingStyle, Legal, LLVM_DEBUG, llvm::None, and TTI.

Referenced by computeMaxVF().

◆ setVectorizedCallDecision()

void LoopVectorizationCostModel::setVectorizedCallDecision ( ElementCount VF)

A call may be vectorized in different ways depending on whether we have vectorized variants available and whether the target supports masking.

This function analyzes all calls in the function at the supplied VF, makes a decision based on the costs of available options, and stores that decision in a map for use in planning and plan execution.

Definition at line 5733 of file LoopVectorize.cpp.

References llvm::CallBase::args(), assert(), CM_IntrinsicCall, CM_Scalarize, CM_VectorCall, CostKind, llvm::dyn_cast(), llvm::CallBase::getArgOperand(), llvm::CallBase::getCalledFunction(), llvm::Module::getFunction(), llvm::InstructionCost::getInvalid(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::getKnownMinValue(), llvm::VFDatabase::getMappings(), llvm::Instruction::getModule(), llvm::VFInfo::getParamIndexForOptionalMask(), getReductionPatternCost(), llvm::ScalarEvolution::getSCEV(), llvm::Value::getType(), getVectorIntrinsicCost(), llvm::getVectorIntrinsicIDForCall(), llvm::GlobalPredicate, I, IntrinsicCost, llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isFixed(), llvm::RecurrenceDescriptor::isFMulAddIntrinsic(), llvm::CallBase::isNoBuiltin(), llvm::ElementCount::isScalar(), isUniformAfterVectorization(), isValid(), llvm::ElementCount::isVector(), Legal, llvm::SCEVPatternMatch::m_SCEV(), llvm::SCEVPatternMatch::m_scev_AffineAddRec(), llvm::SCEVPatternMatch::m_scev_SpecificSInt(), llvm::SCEVPatternMatch::m_SpecificLoop(), llvm::PatternMatch::match(), llvm::Intrinsic::not_intrinsic, llvm::OMP_Linear, llvm::OMP_Uniform, PSE, llvm::SmallVectorTemplateBase< T, bool >::push_back(), setCallWideningDecision(), TheLoop, TLI, llvm::toVectorizedTy(), TTI, and llvm::Vector.

Referenced by collectNonVectorizedAndSetWideningDecisions().

◆ setWideningDecision() [1/2]

void llvm::LoopVectorizationCostModel::setWideningDecision ( const InterleaveGroup< Instruction > * Grp,
ElementCount VF,
InstWidening W,
InstructionCost Cost )
inline

Save vectorization decision W and Cost taken by the cost model for interleaving group Grp and vector width VF.

Broadcast this decision to all instructions inside the group. When interleaving, the cost will only be assigned to one instruction, the insert position. For other cases, add the appropriate fraction of the total cost to each instruction. This ensures accurate costs are used, even if the insert position instruction is not used.
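The cost split described above can be sketched with plain integers. This is an illustrative model, not the LLVM code: `ToyGroupWidening` and `toyDistributeGroupCost` are hypothetical, costs are unsigned integers rather than InstructionCost, the "appropriate fraction" is assumed to be an even integer division by the number of members, and callers are assumed to pass NumMembers > 0 and InsertPos < NumMembers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, reduced set of widening kinds for illustration only.
enum ToyGroupWidening { ToyGroup_Widen, ToyGroup_Interleave, ToyGroup_Scalarize };

// Returns the cost recorded for each group member, indexed by member position.
// InsertPos is the index of the member at the group's insert position.
std::vector<unsigned> toyDistributeGroupCost(ToyGroupWidening W, unsigned Cost,
                                             unsigned NumMembers,
                                             unsigned InsertPos) {
  // Interleaving: the whole cost sits on the insert position, others get 0.
  unsigned InsertPosCost = Cost;
  unsigned OtherMemberCost = 0;
  // Any other decision: every member carries an equal share of the total.
  if (W != ToyGroup_Interleave)
    InsertPosCost = OtherMemberCost = Cost / NumMembers;

  std::vector<unsigned> PerMember(NumMembers, OtherMemberCost);
  PerMember[InsertPos] = InsertPosCost;
  return PerMember;
}
```

For a three-member group with total cost 12 and the insert position at index 1, this model records costs {0, 12, 0} when interleaving and {4, 4, 4} otherwise.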

Definition at line 1058 of file LoopVectorize.cpp.

References assert(), CM_Interleave, llvm::InterleaveGroup< InstTy >::getFactor(), llvm::InterleaveGroup< InstTy >::getInsertPos(), llvm::InterleaveGroup< InstTy >::getMember(), llvm::InterleaveGroup< InstTy >::getNumMembers(), I, and llvm::ElementCount::isVector().

◆ setWideningDecision() [2/2]

void llvm::LoopVectorizationCostModel::setWideningDecision ( Instruction * I,
ElementCount VF,
InstWidening W,
InstructionCost Cost )
inline

Save vectorization decision W and Cost taken by the cost model for instruction I and vector width VF.

Definition at line 1050 of file LoopVectorize.cpp.

References assert(), I, and llvm::ElementCount::isVector().

Referenced by setCostBasedWideningDecision().

◆ shouldConsiderInvariant()

bool LoopVectorizationCostModel::shouldConsiderInvariant ( Value * Op)

Returns true if Op should be considered invariant and if it is trivially hoistable.

Definition at line 5890 of file LoopVectorize.cpp.

References llvm::all_of(), llvm::dyn_cast(), llvm::isa(), isPredicatedInst(), Legal, and TheLoop.

Referenced by getInstructionCost().

◆ shouldConsiderRegPressureForVF()

◆ useMaxBandwidth()

bool LoopVectorizationCostModel::useMaxBandwidth ( TargetTransformInfo::RegisterKind RegKind)
Returns
True if maximizing vector bandwidth is enabled by the target or user options, for the given register kind.

Definition at line 3706 of file LoopVectorize.cpp.

References Legal, MaximizeBandwidth, TTI, and UseWiderVFIfCallVariantsPresent.

Referenced by shouldConsiderRegPressureForVF().

◆ useOrderedReductions()

bool llvm::LoopVectorizationCostModel::useOrderedReductions ( const RecurrenceDescriptor & RdxDesc) const
inline

Returns true if we should use strict in-order reductions for the given RdxDesc.

This is true if the -enable-strict-reductions flag is passed, the IsOrdered flag of RdxDesc is set and we do not allow reordering of FP operations.

Definition at line 968 of file LoopVectorize.cpp.

References Hints, and llvm::RecurrenceDescriptor::isOrdered().

Referenced by collectElementTypesForWidening(), collectInLoopReductions(), and getReductionPatternCost().

◆ usePredicatedReductionSelect()

bool llvm::LoopVectorizationCostModel::usePredicatedReductionSelect ( ) const
inline

Returns true if the predicated reduction select should be used to set the incoming value for the reduction phi.

Definition at line 1395 of file LoopVectorize.cpp.

References foldTailWithEVL(), PreferPredicatedReductionSelect, and TTI.

◆ LoopVectorizationPlanner

friend class LoopVectorizationPlanner
friend

Definition at line 885 of file LoopVectorize.cpp.

References LoopVectorizationPlanner.

Referenced by LoopVectorizationPlanner.

Member Data Documentation

◆ AC

AssumptionCache* llvm::LoopVectorizationCostModel::AC

Assumption cache.

Definition at line 1705 of file LoopVectorize.cpp.

Referenced by collectValuesToIgnore(), and LoopVectorizationCostModel().

◆ CostKind

◆ DB

DemandedBits* llvm::LoopVectorizationCostModel::DB

Demanded bits analysis.

Definition at line 1702 of file LoopVectorize.cpp.

Referenced by LoopVectorizationCostModel().

◆ ElementTypesInLoop

SmallPtrSet<Type *, 16> llvm::LoopVectorizationCostModel::ElementTypesInLoop

All element types found in the loop.

Definition at line 1726 of file LoopVectorize.cpp.

Referenced by collectElementTypesForWidening(), and getSmallestAndWidestTypes().

◆ Hints

const LoopVectorizeHints* llvm::LoopVectorizationCostModel::Hints

Loop Vectorize Hint.

Definition at line 1713 of file LoopVectorize.cpp.

Referenced by LoopVectorizationCostModel(), and useOrderedReductions().

◆ InterleaveInfo

InterleavedAccessInfo& llvm::LoopVectorizationCostModel::InterleaveInfo

The interleave access information contains groups of interleaved accesses with the same stride and close to each other.

Definition at line 1717 of file LoopVectorize.cpp.

Referenced by computeMaxVF(), getInterleavedAccessGroup(), isAccessInterleaved(), LoopVectorizationCostModel(), and requiresScalarEpilogue().

◆ Legal

◆ LI

LoopInfo* llvm::LoopVectorizationCostModel::LI

Loop Info analysis.

Definition at line 1690 of file LoopVectorize.cpp.

Referenced by collectValuesToIgnore(), isLegalGatherOrScatter(), and LoopVectorizationCostModel().

◆ MaxPermissibleVFWithoutMaxBW

FixedScalableVFPair llvm::LoopVectorizationCostModel::MaxPermissibleVFWithoutMaxBW

The highest VF possible for this loop, without using MaxBandwidth.

Definition at line 1736 of file LoopVectorize.cpp.

Referenced by shouldConsiderRegPressureForVF().

◆ OptForSize

bool llvm::LoopVectorizationCostModel::OptForSize

Whether this loop should be optimized for size based on function attribute or profile information.

Definition at line 1733 of file LoopVectorize.cpp.

Referenced by LoopVectorizationCostModel().

◆ ORE

OptimizationRemarkEmitter* llvm::LoopVectorizationCostModel::ORE

Interface to emit optimization remarks.

Definition at line 1708 of file LoopVectorize.cpp.

Referenced by computeMaxVF(), LoopVectorizationCostModel(), and runtimeChecksRequired().

◆ PSE

PredicatedScalarEvolution& llvm::LoopVectorizationCostModel::PSE

Predicated scalar evolution analysis.

Definition at line 1687 of file LoopVectorize.cpp.

Referenced by computeMaxVF(), expectedCost(), getInstructionCost(), LoopVectorizationCostModel(), runtimeChecksRequired(), and setVectorizedCallDecision().

◆ TheFunction

const Function* llvm::LoopVectorizationCostModel::TheFunction

◆ TheLoop

◆ TLI

◆ TTI

◆ ValuesToIgnore

SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::ValuesToIgnore

Values to ignore in the cost model.

Definition at line 1720 of file LoopVectorize.cpp.

Referenced by collectElementTypesForWidening(), collectValuesToIgnore(), and expectedCost().

◆ VecValuesToIgnore

SmallPtrSet<const Value *, 16> llvm::LoopVectorizationCostModel::VecValuesToIgnore

Values to ignore in the cost model when VF > 1.

Definition at line 1723 of file LoopVectorize.cpp.

Referenced by collectValuesToIgnore(), and expectedCost().


The documentation for this class was generated from the following file: