LLVM 18.0.0git
FunctionSpecialization.h
Go to the documentation of this file.
1//===- FunctionSpecialization.h - Function Specialization -----------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9// Overview:
10// ---------
11// Function Specialization is a transformation which propagates the constant
12// parameters of a function call from the caller to the callee. It is part of
13// the Inter-Procedural Sparse Conditional Constant Propagation (IPSCCP) pass.
14// The transformation runs iteratively a number of times which is controlled
15// by the option `funcspec-max-iters`. Running it multiple times is needed
16// for specializing recursive functions, but also exposes new opportunities
17// arising from specializations which return constant values or contain calls
18// which can be specialized.
19//
20// Function Specialization supports propagating constant parameters like
21// function pointers, literal constants and addresses of global variables.
22// By propagating function pointers, indirect calls become direct calls. This
23// exposes inlining opportunities which we would have otherwise missed. That's
24// why function specialization is run before the inliner in the optimization
25// pipeline; that is by design.
26//
27// Cost Model:
28// -----------
29// The cost model facilitates a utility for estimating the specialization bonus
30// from propagating a constant argument. This is the InstCostVisitor, a class
31// that inherits from the InstVisitor. The bonus itself is expressed as codesize
32// and latency savings. Codesize savings means the amount of code that becomes
33// dead in the specialization from propagating the constant, whereas latency
34// savings represents the cycles we are saving from replacing instructions with
35// constant values. The InstCostVisitor overrides a set of `visit*` methods to
36// be able to handle different types of instructions. These attempt to constant-
37// fold the instruction in which case a constant is returned and propagated
38// further.
39//
40// Function pointers are not handled by the InstCostVisitor. They are treated
41// separately as they could expose inlining opportunities via indirect call
42// promotion. The inlining bonus contributes to the total specialization score.
43//
44// For a specialization to be profitable its bonus needs to exceed a minimum
45// threshold. There are three options for controlling the threshold which are
46// expressed as percentages of the original function size:
47// * funcspec-min-codesize-savings
48// * funcspec-min-latency-savings
49// * funcspec-min-inlining-bonus
50// There's also an option for controlling the codesize growth from recursive
51// specializations. That is `funcspec-max-codesize-growth`.
52//
53// Once we have all the potential specializations with their score we need to
54// choose the best ones, which fit in the module specialization budget. That
55// is controlled by the option `funcspec-max-clones`. To find the best `NSpec`
56// specializations we use a max-heap. For more details refer to D139346.
57//
58// Ideas:
59// ------
60// - With a function specialization attribute for arguments, we could have
61// a direct way to steer function specialization, avoiding the cost-model,
62// and thus control compile-times / code-size.
63//
64// - Perhaps a post-inlining function specialization pass could be more
65// aggressive on literal constants.
66//
67// References:
68// -----------
69// 2021 LLVM Dev Mtg “Introducing function specialisation, and can we enable
70// it by default?”, https://www.youtube.com/watch?v=zJiCjeXgV5Q
71//
72//===----------------------------------------------------------------------===//
73
74#ifndef LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
75#define LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
76
81#include "llvm/IR/InstVisitor.h"
86
87using namespace llvm;
88
89namespace llvm {
90// Map of potential specializations for each function. The FunctionSpecializer
91// keeps the discovered specialisation opportunities for the module in a single
92// vector, where the specialisations of each function form a contiguous range.
93// This map's value is the beginning and the end of that range.
95
96// Just a shorter abbreviation to improve indentation.
98
99// Map of known constants found during the specialization bonus estimation.
101
102// Specialization signature, used to uniquely designate a specialization within
103// a function.
104struct SpecSig {
105 // Hashing support, used to distinguish between ordinary, empty, or tombstone
106 // keys.
107 unsigned Key = 0;
109
110 bool operator==(const SpecSig &Other) const {
111 if (Key != Other.Key)
112 return false;
113 return Args == Other.Args;
114 }
115
116 friend hash_code hash_value(const SpecSig &S) {
117 return hash_combine(hash_value(S.Key),
118 hash_combine_range(S.Args.begin(), S.Args.end()));
119 }
120};
121
122// Specialization instance.
123struct Spec {
124 // Original function.
126
127 // Cloned function, a specialized version of the original one.
128 Function *Clone = nullptr;
129
130 // Specialization signature.
132
133 // Profitability of the specialization.
134 unsigned Score;
135
136 // List of call sites, matching this specialization.
138
139 Spec(Function *F, const SpecSig &S, unsigned Score)
140 : F(F), Sig(S), Score(Score) {}
141 Spec(Function *F, const SpecSig &&S, unsigned Score)
142 : F(F), Sig(S), Score(Score) {}
143};
144
145struct Bonus {
146 unsigned CodeSize = 0;
147 unsigned Latency = 0;
148
149 Bonus() = default;
150
152 int64_t Sz = *CodeSize.getValue();
153 int64_t Ltc = *Latency.getValue();
154
155 assert(Sz >= 0 && Ltc >= 0 && "CodeSize and Latency cannot be negative");
156 // It is safe to down cast since we know the arguments
157 // cannot be negative and Cost is of type int64_t.
158 this->CodeSize = static_cast<unsigned>(Sz);
159 this->Latency = static_cast<unsigned>(Ltc);
160 }
161
162 Bonus &operator+=(const Bonus RHS) {
163 CodeSize += RHS.CodeSize;
164 Latency += RHS.Latency;
165 return *this;
166 }
167
168 Bonus operator+(const Bonus RHS) const {
169 return Bonus(CodeSize + RHS.CodeSize, Latency + RHS.Latency);
170 }
171
172 bool operator==(const Bonus RHS) const {
173 return CodeSize == RHS.CodeSize && Latency == RHS.Latency;
174 }
175};
176
177class InstCostVisitor : public InstVisitor<InstCostVisitor, Constant *> {
178 const DataLayout &DL;
181 SCCPSolver &Solver;
182
183 ConstMap KnownConstants;
184 // Basic blocks known to be unreachable after constant propagation.
185 DenseSet<BasicBlock *> DeadBlocks;
186 // PHI nodes we have visited before.
187 DenseSet<Instruction *> VisitedPHIs;
188 // PHI nodes we have visited once without successfully constant folding them.
189 // Once the InstCostVisitor has processed all the specialization arguments,
190 // it should be possible to determine whether those PHIs can be folded
191 // (some of their incoming values may have become constant or dead).
192 SmallVector<Instruction *> PendingPHIs;
193
194 ConstMap::iterator LastVisited;
195
196public:
199 : DL(DL), BFI(BFI), TTI(TTI), Solver(Solver) {}
200
202 return Solver.isBlockExecutable(BB) && !DeadBlocks.contains(BB);
203 }
204
206
208
209private:
210 friend class InstVisitor<InstCostVisitor, Constant *>;
211
212 static bool canEliminateSuccessor(BasicBlock *BB, BasicBlock *Succ,
213 DenseSet<BasicBlock *> &DeadBlocks);
214
215 Bonus getUserBonus(Instruction *User, Value *Use = nullptr,
216 Constant *C = nullptr);
217
218 Cost estimateBasicBlocks(SmallVectorImpl<BasicBlock *> &WorkList);
219 Cost estimateSwitchInst(SwitchInst &I);
220 Cost estimateBranchInst(BranchInst &I);
221
222 Constant *visitInstruction(Instruction &I) { return nullptr; }
223 Constant *visitPHINode(PHINode &I);
224 Constant *visitFreezeInst(FreezeInst &I);
225 Constant *visitCallBase(CallBase &I);
226 Constant *visitLoadInst(LoadInst &I);
227 Constant *visitGetElementPtrInst(GetElementPtrInst &I);
228 Constant *visitSelectInst(SelectInst &I);
229 Constant *visitCastInst(CastInst &I);
230 Constant *visitCmpInst(CmpInst &I);
231 Constant *visitUnaryOperator(UnaryOperator &I);
232 Constant *visitBinaryOperator(BinaryOperator &I);
233};
234
236
237 /// The IPSCCP Solver.
238 SCCPSolver &Solver;
239
240 Module &M;
241
242 /// Analysis manager, needed to invalidate analyses.
244
245 /// Analyses used to help determine if a function should be specialized.
246 std::function<BlockFrequencyInfo &(Function &)> GetBFI;
247 std::function<const TargetLibraryInfo &(Function &)> GetTLI;
248 std::function<TargetTransformInfo &(Function &)> GetTTI;
249 std::function<AssumptionCache &(Function &)> GetAC;
250
251 SmallPtrSet<Function *, 32> Specializations;
252 SmallPtrSet<Function *, 32> FullySpecialized;
253 DenseMap<Function *, CodeMetrics> FunctionMetrics;
254 DenseMap<Function *, unsigned> FunctionGrowth;
255 unsigned NGlobals = 0;
256
257public:
259 SCCPSolver &Solver, Module &M, FunctionAnalysisManager *FAM,
260 std::function<BlockFrequencyInfo &(Function &)> GetBFI,
261 std::function<const TargetLibraryInfo &(Function &)> GetTLI,
262 std::function<TargetTransformInfo &(Function &)> GetTTI,
263 std::function<AssumptionCache &(Function &)> GetAC)
264 : Solver(Solver), M(M), FAM(FAM), GetBFI(GetBFI), GetTLI(GetTLI),
265 GetTTI(GetTTI), GetAC(GetAC) {}
266
268
269 bool run();
270
272 auto &BFI = GetBFI(*F);
273 auto &TTI = GetTTI(*F);
274 return InstCostVisitor(M.getDataLayout(), BFI, TTI, Solver);
275 }
276
277private:
278 Constant *getPromotableAlloca(AllocaInst *Alloca, CallInst *Call);
279
280 /// A constant stack value is an AllocaInst that has a single constant
281 /// value stored to it. Return this constant if such an alloca stack value
282 /// is a function argument.
283 Constant *getConstantStackValue(CallInst *Call, Value *Val);
284
285 /// See if there are any new constant values for the callers of \p F via
286 /// stack variables and promote them to global variables.
287 void promoteConstantStackValues(Function *F);
288
289 /// Clean up fully specialized functions.
290 void removeDeadFunctions();
291
292 /// Remove any ssa_copy intrinsics that may have been introduced.
293 void cleanUpSSA();
294
295 /// @brief Find potential specialization opportunities.
296 /// @param F Function to specialize
297 /// @param FuncSize Cost of specializing a function.
298 /// @param AllSpecs A vector to add potential specializations to.
299 /// @param SM A map for a function's specialisation range
300 /// @return True, if any potential specializations were found
301 bool findSpecializations(Function *F, unsigned FuncSize,
302 SmallVectorImpl<Spec> &AllSpecs, SpecMap &SM);
303
304 /// Compute the inlining bonus for replacing argument \p A with constant \p C.
305 unsigned getInliningBonus(Argument *A, Constant *C);
306
307 bool isCandidateFunction(Function *F);
308
309 /// @brief Create a specialization of \p F and prime the SCCPSolver
310 /// @param F Function to specialize
311 /// @param S Which specialization to create
312 /// @return The new, cloned function
313 Function *createSpecialization(Function *F, const SpecSig &S);
314
315 /// Determine if it is possible to specialise the function for constant values
316 /// of the formal parameter \p A.
317 bool isArgumentInteresting(Argument *A);
318
319 /// Check if the value \p V (an actual argument) is a constant or can only
320 /// have a constant value. Return that constant.
321 Constant *getCandidateConstant(Value *V);
322
323 /// @brief Find and update calls to \p F, which match a specialization
324 /// @param F Orginal function
325 /// @param Begin Start of a range of possibly matching specialisations
326 /// @param End End of a range (exclusive) of possibly matching specialisations
327 void updateCallSites(Function *F, const Spec *Begin, const Spec *End);
328};
329} // namespace llvm
330
331#endif // LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
bool End
Definition: ELF_riscv.cpp:469
#define F(x, y, z)
Definition: MD5.cpp:55
#define I(x, y, z)
Definition: MD5.cpp:58
FunctionAnalysisManager FAM
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
This pass exposes codegen information to IR-level passes.
Value * RHS
an instruction to allocate memory on the stack
Definition: Instructions.h:58
A container for analyses that lazily runs them and caches their results.
Definition: PassManager.h:620
This class represents an incoming formal argument to a Function.
Definition: Argument.h:28
A cache of @llvm.assume calls within a function.
LLVM Basic Block Representation.
Definition: BasicBlock.h:56
BlockFrequencyInfo pass uses BlockFrequencyInfoImpl implementation to estimate IR basic block frequen...
Conditional or Unconditional Branch instruction.
Base class for all callable instructions (InvokeInst and CallInst) Holds everything related to callin...
Definition: InstrTypes.h:1190
This class represents a function call, abstracting a target machine's calling convention.
This is the base class for all instructions that perform data casts.
Definition: InstrTypes.h:428
This class is the base class for the comparison instructions.
Definition: InstrTypes.h:701
This is an important base class in LLVM.
Definition: Constant.h:41
A parsed version of the target data layout string in and methods for querying it.
Definition: DataLayout.h:110
Implements a dense probed hash-table based set.
Definition: DenseSet.h:271
This class represents a freeze function that returns random concrete value if an operand is either a ...
bool run()
Attempt to specialize functions in the module to enable constant propagation across function boundari...
FunctionSpecializer(SCCPSolver &Solver, Module &M, FunctionAnalysisManager *FAM, std::function< BlockFrequencyInfo &(Function &)> GetBFI, std::function< const TargetLibraryInfo &(Function &)> GetTLI, std::function< TargetTransformInfo &(Function &)> GetTTI, std::function< AssumptionCache &(Function &)> GetAC)
InstCostVisitor getInstCostVisitorFor(Function *F)
an instruction for type-safe pointer arithmetic to access elements of arrays and structs
Definition: Instructions.h:940
bool isBlockExecutable(BasicBlock *BB)
Bonus getSpecializationBonus(Argument *A, Constant *C)
Compute a bonus for replacing argument A with constant C.
InstCostVisitor(const DataLayout &DL, BlockFrequencyInfo &BFI, TargetTransformInfo &TTI, SCCPSolver &Solver)
Base class for instruction visitors.
Definition: InstVisitor.h:78
An instruction for reading from memory.
Definition: Instructions.h:177
A Module instance is used to store all the information related to an LLVM module.
Definition: Module.h:65
SCCPSolver - This interface class is a general purpose solver for Sparse Conditional Constant Propaga...
Definition: SCCPSolver.h:65
bool isBlockExecutable(BasicBlock *BB) const
This class represents the LLVM 'select' instruction.
SmallPtrSet - This class implements a set which is optimized for holding SmallSize or less elements.
Definition: SmallPtrSet.h:451
This class consists of common code factored out of the SmallVector class to reduce code duplication b...
Definition: SmallVector.h:577
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
Definition: SmallVector.h:1200
Multiway switch.
Provides information about what library functions are available for the current target.
This pass provides access to the codegen interfaces that are needed for IR-level transformations.
A Use represents the edge between a Value definition and its users.
Definition: Use.h:43
LLVM Value Representation.
Definition: Value.h:74
bool contains(const_arg_type_t< ValueT > V) const
Check if the set contains the given element.
Definition: DenseSet.h:185
An opaque object representing a hash code.
Definition: Hashing.h:74
@ C
The default llvm calling convention, compatible with C.
Definition: CallingConv.h:34
This is an optimization pass for GlobalISel generic memory operations.
Definition: AddressRanges.h:18
@ Other
Any other memory.
hash_code hash_combine(const Ts &...args)
Combine values into a single hash_code.
Definition: Hashing.h:613
hash_code hash_combine_range(InputIteratorT first, InputIteratorT last)
Compute a hash_code for a sequence of values.
Definition: Hashing.h:491
bool operator==(const Bonus RHS) const
Bonus(Cost CodeSize, Cost Latency)
Bonus operator+(const Bonus RHS) const
Bonus & operator+=(const Bonus RHS)
Bonus()=default
SmallVector< ArgInfo, 4 > Args
bool operator==(const SpecSig &Other) const
friend hash_code hash_value(const SpecSig &S)
SmallVector< CallBase * > CallSites
Spec(Function *F, const SpecSig &S, unsigned Score)
Spec(Function *F, const SpecSig &&S, unsigned Score)