#define DEBUG_TYPE "amdgpu-annotate-uniform"
// ...
I->setMetadata("amdgpu.uniform", MDNode::get(I->getContext(), {}));
// ...
I->setMetadata("amdgpu.noclobber", MDNode::get(I->getContext(), {}));
// ...
AMDGPUAnnotateUniformValues() :
// ...
bool doInitialization(Module &M) override;
// ...
return "AMDGPU Annotate Uniform Values";
// ...
"Add AMDGPU uniform metadata", false, false)
// ...
char AMDGPUAnnotateUniformValues::ID = 0;
// ...
void AMDGPUAnnotateUniformValues::visitBranchInst(BranchInst &I) {
  if (DA->isUniform(&I))
    setUniformMetadata(&I);
// ...
void AMDGPUAnnotateUniformValues::visitLoadInst(LoadInst &I) {
  Value *Ptr = I.getPointerOperand();
  if (!DA->isUniform(Ptr))
  // ...
  setUniformMetadata(PtrI);
  // ...
  setNoClobberMetadata(&I);
// ...
bool AMDGPUAnnotateUniformValues::doInitialization(Module &M) {
// ...
DA = &getAnalysis<LegacyDivergenceAnalysis>();
MSSA = &getAnalysis<MemorySSAWrapperPass>().getMSSA();
AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
// ...
return new AMDGPUAnnotateUniformValues();
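The listing above is fragmentary, so the following is only a minimal, self-contained sketch of the annotation rule it appears to implement. It is not LLVM code. The types `MockInst` and `MockLoad` and the functions `annotateBranch` and `annotateLoad` are hypothetical stand-ins invented for illustration. The condition guarding `amdgpu.noclobber` (entry function, global address space, not clobbered) is an assumption inferred from the symbols this file references (`isEntryFunctionCC`, `GLOBAL_ADDRESS`, `isClobberedInFunction`), not something the visible lines confirm.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for an LLVM instruction: only tracks metadata kinds.
struct MockInst {
  std::set<std::string> Metadata;
  bool hasMetadata(const std::string &Kind) const {
    return Metadata.count(Kind) != 0;
  }
};

// Hypothetical stand-in for a load; the flags replace the real analyses.
struct MockLoad {
  MockInst Inst;                // the load itself
  MockInst *PtrDef = nullptr;   // instruction defining the pointer, if any
  bool PtrIsUniform = false;    // stands in for DA->isUniform(Ptr)
  bool IsGlobalAddress = false; // assumed address-space check
  bool IsClobbered = true;      // stands in for isClobberedInFunction(...)
};

// Mirrors visitBranchInst: a uniform branch is tagged "amdgpu.uniform".
void annotateBranch(MockInst &Br, bool IsUniform) {
  if (IsUniform)
    Br.Metadata.insert("amdgpu.uniform");
}

// Loosely mirrors visitLoadInst under the assumptions stated above.
void annotateLoad(MockLoad &Ld, bool IsEntryFunc) {
  if (!Ld.PtrIsUniform)
    return; // divergent pointer: nothing is annotated
  if (Ld.PtrDef)
    Ld.PtrDef->Metadata.insert("amdgpu.uniform");
  if (!IsEntryFunc)
    return; // assumed: noclobber is only decided for entry functions
  if (Ld.IsGlobalAddress && !Ld.IsClobbered)
    Ld.Inst.Metadata.insert("amdgpu.noclobber");
}
```

In this model the uniform tag goes on the instruction that produces the pointer, while the noclobber tag goes on the load itself, matching the arguments passed to `setUniformMetadata(PtrI)` and `setNoClobberMetadata(&I)` in the listing.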
static MDTuple * get(LLVMContext &Context, ArrayRef< Metadata * > MDs)
Legacy analysis pass which computes MemorySSA.
Represent the analysis usage information of a pass.
FunctionPass * createAMDGPUAnnotateUniformValues()
bool isClobberedInFunction(const LoadInst *Load, MemorySSA *MSSA, AAResults *AA)
Check if a Load is clobbered in its function.
unsigned ID
LLVM IR allows arbitrary numbers to be used as calling convention identifiers.
bool isEntryFunctionCC(CallingConv::ID CC)
#define INITIALIZE_PASS_END(passName, arg, name, cfg, analysis)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
Encapsulates MemorySSA, including all data associated with memory accesses.
A Module instance is used to store all the information related to an LLVM module.
StringRef - Represent a constant reference to a string, i.e.
@ GLOBAL_ADDRESS
Address space for global memory (RAT0, VTX0).
Base class for instruction visitors.
An instruction for reading from memory.
static bool runOnFunction(Function &F, bool PostInlining)
void setPreservesAll()
Set by analyses that do not transform their input at all.
A wrapper pass to provide the legacy pass manager access to a suitably prepared AAResults object.
FunctionPass class - This class is used to implement most global optimizations.
AnalysisUsage & addRequired()
Conditional or Unconditional Branch instruction.
LLVM Value Representation.