LLVM 22.0.0git
This pass performs peephole optimizations before code emission.
#include "AMDGPU.h"
#include "GCNSubtarget.h"
#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/TargetSchedule.h"
#include "llvm/Support/BranchProbability.h"
Macros
#define DEBUG_TYPE "si-pre-emit-peephole"
This pass performs peephole optimizations before code emission.
Additionally, this pass unpacks packed instructions (V_PK_MUL_F32/F16, V_PK_ADD_F32/F16, V_PK_FMA_F32) adjacent to MFMAs so that they can be co-issued. This helps overlap MFMAs with certain vector instructions in machine schedules and is expected to improve performance. Only packed instructions that are overlapped by the MFMA latency are unpacked; the rest remain untouched. TODO: Add support for F16 packed instructions.
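The selection rule described above (unpack only the packed instructions that fall inside the MFMA latency window) can be illustrated with a small standalone sketch. This is not the code in SIPreEmitPeephole.cpp: `ToyInst`, `collectUnpackCandidates`, and the cycle counts are hypothetical names and values invented for illustration, and the real pass works on LLVM machine instructions and the target's scheduling model rather than on a plain vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model of an instruction; not an LLVM type.
struct ToyInst {
  std::string Name;
  bool IsPacked;   // true for a V_PK_* style instruction
  int IssueCycles; // assumed number of cycles the instruction occupies
};

// Walks the instructions that follow the MFMA at index MFMAIdx and returns
// the indices of packed instructions that begin issuing before the assumed
// MFMA latency has elapsed. Instructions past that window are left alone.
std::vector<std::size_t>
collectUnpackCandidates(const std::vector<ToyInst> &Insts,
                        std::size_t MFMAIdx, int MFMALatency) {
  std::vector<std::size_t> Candidates;
  int Elapsed = 0; // cycles consumed since the MFMA was issued
  for (std::size_t I = MFMAIdx + 1; I < Insts.size(); ++I) {
    if (Elapsed >= MFMALatency)
      break; // no longer overlapped by the MFMA
    if (Insts[I].IsPacked)
      Candidates.push_back(I);
    Elapsed += Insts[I].IssueCycles;
  }
  return Candidates;
}
```

In this model, a packed instruction that starts after the latency window has closed is never reported, which mirrors the statement that the remaining packed instructions stay untouched.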
Definition in file SIPreEmitPeephole.cpp.
#define DEBUG_TYPE "si-pre-emit-peephole"
Definition at line 31 of file SIPreEmitPeephole.cpp.