LLVM 20.0.0git
LoopVectorize.h
//===- LoopVectorize.h ------------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This is the LLVM loop vectorizer. This pass modifies 'vectorizable' loops
// and generates target-independent LLVM-IR.
// The vectorizer uses the TargetTransformInfo analysis to estimate the costs
// of instructions in order to estimate the profitability of vectorization.
//
// The loop vectorizer combines consecutive loop iterations into a single
// 'wide' iteration. After this transformation the index is incremented
// by the SIMD vector width, and not by one.
//
// This pass has four parts:
// 1. The main loop pass that drives the different parts.
// 2. LoopVectorizationLegality - A unit that checks for the legality
//    of the vectorization.
// 3. InnerLoopVectorizer - A unit that performs the actual
//    widening of instructions.
// 4. LoopVectorizationCostModel - A unit that checks for the profitability
//    of vectorization. It decides on the optimal vector width, which
//    can be one, if vectorization is not profitable.
//
// There is a development effort going on to migrate loop vectorizer to the
// VPlan infrastructure and to introduce outer loop vectorization support (see
// docs/VectorizationPlan.rst and
// http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). For this
// purpose, we temporarily introduced the VPlan-native vectorization path: an
// alternative vectorization path that is natively implemented on top of the
// VPlan infrastructure. See EnableVPlanNativePath for enabling.
//
//===----------------------------------------------------------------------===//
//
// The reduction-variable vectorization is based on the paper:
//  D. Nuzman and R. Henderson. Multi-platform Auto-vectorization.
//
// Variable uniformity checks are inspired by:
//  Karrenberg, R. and Hack, S. Whole Function Vectorization.
//
// The interleaved access vectorization is based on the paper:
//  Dorit Nuzman, Ira Rosen and Ayal Zaks.  Auto-Vectorization of Interleaved
//  Data for SIMD
//
// Other ideas/concepts are from:
//  A. Zaks and D. Nuzman. Autovectorization in GCC-two years later.
//
//  S. Maleki, Y. Gao, M. Garzaran, T. Wong and D. Padua.  An Evaluation of
//  Vectorizing Compilers.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZE_H
#define LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZE_H

#include "llvm/IR/PassManager.h"
#include "llvm/Support/CommandLine.h"
#include <functional>

namespace llvm {

class AssumptionCache;
class BlockFrequencyInfo;
class DemandedBits;
class DominatorTree;
class Function;
class Instruction;
class Loop;
class LoopAccessInfoManager;
class LoopInfo;
class OptimizationRemarkEmitter;
class ProfileSummaryInfo;
class ScalarEvolution;
class TargetLibraryInfo;
class TargetTransformInfo;

extern cl::opt<bool> EnableLoopInterleaving;
extern cl::opt<bool> EnableLoopVectorization;

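/// A marker to determine if extra passes after loop vectorization should be
/// run.
struct ShouldRunExtraVectorPasses
    : public AnalysisInfoMixin<ShouldRunExtraVectorPasses> {
  static AnalysisKey Key;
  struct Result {
    bool invalidate(Function &F, const PreservedAnalyses &PA,
                    FunctionAnalysisManager::Invalidator &) {
      // Check whether the analysis has been explicitly invalidated. Otherwise,
      // it remains preserved.
      auto PAC = PA.getChecker<ShouldRunExtraVectorPasses>();
      return !PAC.preservedWhenStateless();
    }
  };

  Result run(Function &F, FunctionAnalysisManager &FAM) { return Result(); }
};
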
/// A pass manager to run a set of extra function simplification passes after
/// vectorization, if requested. LoopVectorize caches the
/// ShouldRunExtraVectorPasses analysis to request extra simplifications, if
/// they could be beneficial.
struct ExtraVectorPassManager : public FunctionPassManager {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
    auto PA = PreservedAnalyses::all();
    if (AM.getCachedResult<ShouldRunExtraVectorPasses>(F))
      PA.intersect(FunctionPassManager::run(F, AM));
    PA.abandon<ShouldRunExtraVectorPasses>();
    return PA;
  }
};

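struct LoopVectorizeOptions {
  /// If false, consider all loops for interleaving.
  /// If true, only loops that explicitly request interleaving are considered.
  bool InterleaveOnlyWhenForced;

  /// If false, consider all loops for vectorization.
  /// If true, only loops that explicitly request vectorization are considered.
  bool VectorizeOnlyWhenForced;

  /// The current defaults when creating the pass with no arguments are:
  /// EnableLoopInterleaving = true and EnableLoopVectorization = true. This
  /// means that interleaving default is consistent with the cl::opt flag, while
  /// vectorization is not.
  /// FIXME: The default for EnableLoopVectorization in the cl::opt should be
  /// set to true, and the corresponding change to account for this be made in
  /// opt.cpp. The initializations below will become:
  /// InterleaveOnlyWhenForced(!EnableLoopInterleaving)
  /// VectorizeOnlyWhenForced(!EnableLoopVectorization).
  LoopVectorizeOptions()
      : InterleaveOnlyWhenForced(false), VectorizeOnlyWhenForced(false) {}
  LoopVectorizeOptions(bool InterleaveOnlyWhenForced,
                       bool VectorizeOnlyWhenForced)
      : InterleaveOnlyWhenForced(InterleaveOnlyWhenForced),
        VectorizeOnlyWhenForced(VectorizeOnlyWhenForced) {}

  LoopVectorizeOptions &setInterleaveOnlyWhenForced(bool Value) {
    InterleaveOnlyWhenForced = Value;
    return *this;
  }

  LoopVectorizeOptions &setVectorizeOnlyWhenForced(bool Value) {
    VectorizeOnlyWhenForced = Value;
    return *this;
  }
};
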
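/// Storage for information about made changes.
struct LoopVectorizeResult {
  bool MadeAnyChange;
  bool MadeCFGChange;

  LoopVectorizeResult(bool MadeAnyChange, bool MadeCFGChange)
      : MadeAnyChange(MadeAnyChange), MadeCFGChange(MadeCFGChange) {}
};
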
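/// The LoopVectorize Pass.
struct LoopVectorizePass : public PassInfoMixin<LoopVectorizePass> {
private:
  /// If false, consider all loops for interleaving.
  /// If true, only loops that explicitly request interleaving are considered.
  bool InterleaveOnlyWhenForced;

  /// If false, consider all loops for vectorization.
  /// If true, only loops that explicitly request vectorization are considered.
  bool VectorizeOnlyWhenForced;

public:
  LoopVectorizePass(LoopVectorizeOptions Opts = {});

  ScalarEvolution *SE;
  LoopInfo *LI;
  TargetTransformInfo *TTI;
  DominatorTree *DT;
  BlockFrequencyInfo *BFI;
  TargetLibraryInfo *TLI;
  DemandedBits *DB;
  AssumptionCache *AC;
  LoopAccessInfoManager *LAIs;
  OptimizationRemarkEmitter *ORE;
  ProfileSummaryInfo *PSI;

  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
  void printPipeline(raw_ostream &OS,
                     function_ref<StringRef(StringRef)> MapClassName2PassName);

  // Shim for old PM.
  LoopVectorizeResult runImpl(Function &F, ScalarEvolution &SE_, LoopInfo &LI_,
                              TargetTransformInfo &TTI_, DominatorTree &DT_,
                              BlockFrequencyInfo *BFI_, TargetLibraryInfo *TLI_,
                              DemandedBits &DB_, AssumptionCache &AC_,
                              LoopAccessInfoManager &LAIs_,
                              OptimizationRemarkEmitter &ORE_,
                              ProfileSummaryInfo *PSI_);

  bool processLoop(Loop *L);
};
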
/// Reports a vectorization failure: print \p DebugMsg for debugging
/// purposes along with the corresponding optimization remark \p ORETag
/// with message \p OREMsg.
/// If \p I is passed, it is an instruction that prevents vectorization.
/// Otherwise, the loop \p TheLoop is used for the location of the remark.
void reportVectorizationFailure(const StringRef DebugMsg,
    const StringRef OREMsg, const StringRef ORETag,
    OptimizationRemarkEmitter *ORE, Loop *TheLoop, Instruction *I = nullptr);

} // end namespace llvm

#endif // LLVM_TRANSFORMS_VECTORIZE_LOOPVECTORIZE_H
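
The file header describes the core transformation: consecutive iterations are combined into one 'wide' iteration, and the index then advances by the vector width instead of by one. The sketch below illustrates that idea by hand in plain C++, without any LLVM APIs. The names `addScalar`, `addWide`, `widenedMatchesScalar` and the fixed `VF = 4` are hypothetical and chosen only for illustration; the real pass works on LLVM-IR and picks the width through its cost model. The scalar remainder loop is likewise an assumption about one common way to handle trip counts that are not a multiple of the width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vectorization factor. In the real pass this is chosen by the
// cost model and may be one when vectorization is not profitable.
constexpr std::size_t VF = 4;

// Scalar form: the index advances by one per iteration.
void addScalar(const int *A, const int *B, int *Out, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = A[I] + B[I];
}

// Hand-widened form: each trip of the main loop covers VF consecutive
// iterations, so the index advances by VF rather than by one.
void addWide(const int *A, const int *B, int *Out, std::size_t N) {
  std::size_t I = 0;
  for (; I + VF <= N; I += VF) {
    // The inner loop stands in for a single SIMD operation on VF lanes.
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      Out[I + Lane] = A[I + Lane] + B[I + Lane];
  }
  // Scalar remainder for the last N % VF elements (an assumption of this
  // sketch, not something the header comment specifies).
  for (; I < N; ++I)
    Out[I] = A[I] + B[I];
}

// Returns true when the widened loop produces the same output as the scalar
// loop for an input of N elements.
bool widenedMatchesScalar(std::size_t N) {
  std::vector<int> A(N), B(N), Scalar(N, 0), Wide(N, 0);
  for (std::size_t I = 0; I < N; ++I) {
    A[I] = static_cast<int>(I);
    B[I] = static_cast<int>(2 * I + 1);
  }
  addScalar(A.data(), B.data(), Scalar.data(), N);
  addWide(A.data(), B.data(), Wide.data(), N);
  return Scalar == Wide;
}
```

Calling `widenedMatchesScalar` with lengths below, equal to, and above a multiple of `VF` exercises both the wide loop and the remainder loop.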