LLVM 20.0.0git
This file implements the targeting of the RegisterBankInfo class for AMDGPU.
#include "AMDGPURegisterBankInfo.h"
#include "AMDGPU.h"
#include "AMDGPUGlobalISelUtils.h"
#include "AMDGPUInstrInfo.h"
#include "GCNSubtarget.h"
#include "SIMachineFunctionInfo.h"
#include "SIRegisterInfo.h"
#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
#include "llvm/CodeGen/GlobalISel/LegalizerHelper.h"
#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
#include "llvm/CodeGen/RegisterBank.h"
#include "llvm/IR/IntrinsicsAMDGPU.h"
#include "AMDGPUGenRegisterBank.inc"
#include "AMDGPUGenRegisterBankInfo.def"
Macros
#define GET_TARGET_REGBANK_IMPL
Functions
static bool isVectorRegisterBank (const RegisterBank &Bank)
static void setRegsToType (MachineRegisterInfo &MRI, ArrayRef< Register > Regs, LLT NewTy)
Replace the current type each register in Regs has with NewTy.
static LLT getHalfSizedType (LLT Ty)
static std::pair< LLT, LLT > splitUnequalType (LLT Ty, unsigned FirstSize)
Split Ty into 2 pieces.
static LLT widen96To128 (LLT Ty)
static unsigned getSBufferLoadCorrespondingBufferLoadOpcode (unsigned Opc)
static unsigned getExtendOp (unsigned Opc)
static std::pair< Register, Register > unpackV2S16ToS32 (MachineIRBuilder &B, Register Src, unsigned ExtOpcode)
static bool substituteSimpleCopyRegs (const AMDGPURegisterBankInfo::OperandsMapper &OpdMapper, unsigned OpIdx)
static std::pair< Register, unsigned > getBaseWithConstantOffset (MachineRegisterInfo &MRI, Register Reg)
static void reinsertVectorIndexAdd (MachineIRBuilder &B, MachineInstr &IdxUseInstr, unsigned OpIdx, unsigned ConstOffset)
Utility function for pushing dynamic vector indexes with a constant offset into waterfall loops.
static void extendLow32IntoHigh32 (MachineIRBuilder &B, Register Hi32Reg, Register Lo32Reg, unsigned ExtOpc, const RegisterBank &RegBank, bool IsBooleanSrc=false)
Implement extending a 32-bit value to a 64-bit value.
static Register constrainRegToBank (MachineRegisterInfo &MRI, MachineIRBuilder &B, Register &Reg, const RegisterBank &Bank)
static unsigned regBankUnion (unsigned RB0, unsigned RB1)
static unsigned regBankBoolUnion (unsigned RB0, unsigned RB1)
AMDGPU has unique register bank constraints that require special high-level strategies to deal with. There are two main true physical register banks: VGPR (vector) and SGPR (scalar). Additionally, the VCC register bank is a sort of pseudo-register bank needed to represent SGPRs used in a vector boolean context. There is also the AGPR bank, which is a special-purpose physical register bank present on some subtargets.
Copying from VGPR to SGPR is generally illegal, unless the value is known to be uniform. It is generally not valid to legalize operands by inserting copies as on other targets. Operations which require uniform, SGPR operands generally require scalarization by repeatedly executing the instruction, activating each set of lanes using a unique set of input values. This is referred to as a waterfall loop.
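The waterfall loop described above can be sketched in plain C++ as a simulation. This is a minimal model, not the code from AMDGPURegisterBankInfo.cpp: the function names `firstActiveLane` and `runWaterfallLoop`, the 64-bit lane mask, and the callback signature are all assumptions made for illustration. Each iteration reads the operand value from the first remaining lane, runs the body once for every lane that holds the same value, and retires those lanes.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: index of the lowest set bit. Precondition: mask != 0.
static unsigned firstActiveLane(uint64_t mask) {
  unsigned lane = 0;
  while ((mask & 1) == 0) {
    mask >>= 1;
    ++lane;
  }
  return lane;
}

// Hypothetical model of a waterfall loop. laneValues holds the per-lane value
// of an operand that the instruction requires to be uniform. execMask must
// only have bits set for lanes below laneValues.size() (at most 64 lanes).
// The body is invoked once per unique value with the set of lanes sharing it.
// Returns the number of iterations executed.
unsigned runWaterfallLoop(
    const std::vector<uint32_t> &laneValues, uint64_t execMask,
    const std::function<void(uint32_t, uint64_t)> &body) {
  uint64_t remaining = execMask;
  unsigned iterations = 0;
  while (remaining != 0) {
    // Take the value from the first lane that has not been handled yet.
    uint32_t uniformValue = laneValues[firstActiveLane(remaining)];
    // Collect every remaining lane that holds the same value.
    uint64_t matching = 0;
    for (unsigned lane = 0; lane < laneValues.size(); ++lane) {
      uint64_t bit = uint64_t(1) << lane;
      if ((remaining & bit) && laneValues[lane] == uniformValue)
        matching |= bit;
    }
    body(uniformValue, matching); // Execute with only the matching lanes active.
    remaining &= ~matching;       // Retire the lanes just handled.
    ++iterations;
  }
  return iterations;
}
```

In this model the iteration count equals the number of distinct values among the active lanes, so a value that is already uniform costs a single pass.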
Booleans (s1 values) require special consideration. A vector compare result is naturally a bitmask with one bit per lane, in a 32 or 64-bit register. These are represented with the VCC bank. During selection, we need to be able to unambiguously go back from a register class to a register bank. To distinguish whether an SGPR should use the SGPR or VCC register bank, we need to know the use context type. An SGPR s1 value always means a VCC bank value; any other SGPR value uses the SGPR bank. A scalar compare sets SCC, which is a 1-bit unaddressable register. This will need to be copied to a 32-bit virtual register. Taken together, this means we need to adjust the type of boolean operations to be regbank legal. All SALU booleans need to be widened to 32 bits, and all VALU booleans need to be s1 values.
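The two rules in this paragraph can be written down as a pair of small functions. This is an illustrative sketch only: the `Bank` enum and the names `bankForScalarClassValue` and `legalBooleanWidth` are hypothetical and do not appear in the LLVM sources.

```cpp
#include <cassert>

// Hypothetical stand-in for the register banks discussed in the text.
enum class Bank { SGPR, VGPR, VCC };

// Going back from an SGPR register class to a bank: an s1 value is taken to
// be a VCC-bank value, any other size is an SGPR-bank value.
Bank bankForScalarClassValue(unsigned sizeInBits) {
  return sizeInBits == 1 ? Bank::VCC : Bank::SGPR;
}

// Regbank-legal width of a boolean result: s1 for VALU operations, 32 bits
// for SALU operations (the SCC result copied into a 32-bit register).
unsigned legalBooleanWidth(bool isVALU) {
  return isVALU ? 1u : 32u;
}
```

The exception for legalization artifact casts described in the next paragraph is deliberately left out of this sketch.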
A noteworthy exception to the s1-means-vcc rule is for legalization artifact casts. G_TRUNC s1 results, and G_SEXT/G_ZEXT/G_ANYEXT sources are never vcc bank. A non-boolean source (such as a truncate from a 1-bit load from memory) will require a copy to the VCC bank which will require clearing the high bits and inserting a compare.
VALU instructions have a limitation known as the constant bus restriction. Most VALU instructions can use SGPR operands, but may read at most 1 SGPR or constant literal value (this is raised to 2 in gfx10 for most instructions). The limit counts unique SGPRs, so the same SGPR may be used for multiple operands. From a register bank perspective, any combination of operands should be legal as an SGPR, but this is contextually dependent on the SGPR operands all being the same register. It is therefore optimal to choose the SGPR with the most uses to minimize the number of copies.
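The counting argument above can be illustrated with a small standalone sketch. It assumes a limit of one unique SGPR, ignores constant literals, and assumes one copy per rewritten operand; the `SrcOperand` struct and all function names are hypothetical and are not part of the LLVM API.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical operand model: a register number plus whether it is an SGPR.
struct SrcOperand {
  bool isSGPR;
  unsigned reg;
};

// Number of distinct SGPRs read by the instruction.
unsigned countUniqueSGPRs(const std::vector<SrcOperand> &ops) {
  std::set<unsigned> unique;
  for (const SrcOperand &op : ops)
    if (op.isSGPR)
      unique.insert(op.reg);
  return static_cast<unsigned>(unique.size());
}

// True if the operands respect a constant bus limit (1, or 2 on gfx10 for
// most instructions). Literals are not modelled here.
bool fitsConstantBus(const std::vector<SrcOperand> &ops, unsigned limit) {
  return countUniqueSGPRs(ops) <= limit;
}

// Pick the SGPR with the most uses; ties go to the lowest register number.
// Returns -1 if there is no SGPR operand.
int pickSGPRToKeep(const std::vector<SrcOperand> &ops) {
  std::map<unsigned, unsigned> uses;
  for (const SrcOperand &op : ops)
    if (op.isSGPR)
      ++uses[op.reg];
  int best = -1;
  unsigned bestUses = 0;
  for (const auto &entry : uses) {
    if (entry.second > bestUses) {
      best = static_cast<int>(entry.first);
      bestUses = entry.second;
    }
  }
  return best;
}

// Operands that must be rewritten to VGPR copies when only one SGPR is kept.
unsigned countOperandCopies(const std::vector<SrcOperand> &ops) {
  int keep = pickSGPRToKeep(ops);
  unsigned copies = 0;
  for (const SrcOperand &op : ops)
    if (op.isSGPR && static_cast<int>(op.reg) != keep)
      ++copies;
  return copies;
}
```

With operands reading SGPRs 5, 9 and 5, keeping SGPR 5 leaves one operand to copy, whereas keeping SGPR 9 would leave two.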
We avoid trying to solve this problem in RegBankSelect. Any VALU G_* operation should have its source operands all mapped to VGPRs (except for VCC), inserting copies from any SGPR operands. This is the most trivial legal mapping. Anything beyond the simplest 1:1 instruction selection would be too complicated to solve here. Every optimization pattern or instruction selected to multiple outputs would have to enforce this rule, and there would be additional complexity in tracking this rule for every G_* operation. Forcing all inputs to VGPRs also simplifies the task of picking the optimal operand combination in a post-isel optimization pass.
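As a rough illustration of this trivial mapping, the sketch below rewrites a list of source operand banks so that every SGPR source becomes a VGPR source and counts the copies this implies. The `SrcBank` enum, `MappingResult` struct and `mapVALUSources` function are hypothetical names chosen for the example, not the interface used by RegBankSelect.

```cpp
#include <vector>

// Hypothetical stand-in for the banks a VALU source operand may start in.
enum class SrcBank { SGPR, VGPR, VCC };

struct MappingResult {
  std::vector<SrcBank> banks; // Bank of each source after mapping.
  unsigned copiesInserted;    // One copy per SGPR source moved to a VGPR.
};

// Map every source of a VALU operation to VGPR, leaving VCC sources alone.
MappingResult mapVALUSources(const std::vector<SrcBank> &sources) {
  MappingResult result;
  result.copiesInserted = 0;
  for (SrcBank bank : sources) {
    if (bank == SrcBank::SGPR) {
      result.banks.push_back(SrcBank::VGPR);
      ++result.copiesInserted;
    } else {
      result.banks.push_back(bank);
    }
  }
  return result;
}
```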
Definition in file AMDGPURegisterBankInfo.cpp.
#define GET_TARGET_REGBANK_IMPL
Definition at line 86 of file AMDGPURegisterBankInfo.cpp.
|
static |
Definition at line 2030 of file AMDGPURegisterBankInfo.cpp.
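static void extendLow32IntoHigh32 (MachineIRBuilder &B, Register Hi32Reg, Register Lo32Reg, unsigned ExtOpc, const RegisterBank &RegBank, bool IsBooleanSrc=false)
Implement extending a 32-bit value to a 64-bit value.
Lo32Reg is the original 32-bit source value (to be inserted in the low part of the combined 64-bit result), and Hi32Reg is the high half of the combined 64-bit value.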
Definition at line 1922 of file AMDGPURegisterBankInfo.cpp.
References assert(), B, and llvm::LLT::scalar().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
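The arithmetic behind producing a high half from a low half can be shown on plain integers. This is a sketch of the general idea for sign and zero extension only, not the MachineIRBuilder code in the file; `ExtKind`, `computeHigh32` and `combineHalves` are hypothetical names, and the bank-dependent and boolean-source handling is not modelled.

```cpp
#include <cstdint>

// Hypothetical stand-in for the extension opcode.
enum class ExtKind { SignExtend, ZeroExtend };

// High 32 bits of the 64-bit result, derived only from the low 32 bits.
uint32_t computeHigh32(uint32_t lo32, ExtKind kind) {
  if (kind == ExtKind::ZeroExtend)
    return 0u;
  // Sign extension replicates bit 31 of the low half into every high bit.
  return (lo32 & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}

// Reassemble the 64-bit value from its two halves.
uint64_t combineHalves(uint32_t lo32, uint32_t hi32) {
  return (static_cast<uint64_t>(hi32) << 32) | lo32;
}
```

Because the low half is reused unchanged, only the high half has to be computed, which is why the function takes both registers separately.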
|
static |
Definition at line 1807 of file AMDGPURegisterBankInfo.cpp.
References llvm::sampleprof::Base, llvm::MIPatternMatch::m_GAdd(), llvm::MIPatternMatch::m_ICst(), llvm::MIPatternMatch::m_Reg(), llvm::MIPatternMatch::mi_match(), and MRI.
Referenced by getBaseWithConstantOffset(), isConsecutiveLSLoc(), and llvm::AMDGPURegisterBankInfo::splitBufferOffsets().
Definition at line 1728 of file AMDGPURegisterBankInfo.cpp.
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
Definition at line 690 of file AMDGPURegisterBankInfo.cpp.
References assert(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::divideCoefficientBy(), llvm::LLT::getElementCount(), llvm::LLT::getElementType(), llvm::LLT::getScalarSizeInBits(), llvm::details::FixedOrScalableQuantity< LeafTy, ValueTy >::isKnownMultipleOf(), llvm::LLT::isVector(), llvm::LLT::scalar(), and llvm::LLT::scalarOrVector().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
Definition at line 1336 of file AMDGPURegisterBankInfo.cpp.
References llvm_unreachable.
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingSBufferLoad().
|
static |
Definition at line 221 of file AMDGPURegisterBankInfo.cpp.
References llvm::RegisterBank::getID().
Referenced by llvm::AMDGPURegisterBankInfo::copyCost().
Definition at line 3509 of file AMDGPURegisterBankInfo.cpp.
References regBankUnion().
Referenced by llvm::AMDGPURegisterBankInfo::getInstrMapping().
Definition at line 3494 of file AMDGPURegisterBankInfo.cpp.
Referenced by llvm::AMDGPURegisterBankInfo::getInstrMapping(), llvm::AMDGPURegisterBankInfo::getMappingType(), and regBankBoolUnion().
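static void reinsertVectorIndexAdd (MachineIRBuilder &B, MachineInstr &IdxUseInstr, unsigned OpIdx, unsigned ConstOffset)
Utility function for pushing dynamic vector indexes with a constant offset into waterfall loops.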
Definition at line 1901 of file AMDGPURegisterBankInfo.cpp.
References llvm::Add, B, llvm::ilist_node_impl< OptionsT >::getIterator(), llvm::MachineInstr::getOperand(), llvm::MachineInstr::getParent(), llvm::MachineOperand::getReg(), MRI, S32, llvm::LLT::scalar(), and llvm::MachineOperand::setReg().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
static void setRegsToType (MachineRegisterInfo &MRI, ArrayRef< Register > Regs, LLT NewTy)
Replace the current type each register in Regs has with NewTy.
Definition at line 682 of file AMDGPURegisterBankInfo.cpp.
References assert(), llvm::LLT::getSizeInBits(), and MRI.
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl(), and llvm::AMDGPURegisterBankInfo::applyMappingSMULU64().
static std::pair< LLT, LLT > splitUnequalType (LLT Ty, unsigned FirstSize)
Split Ty into 2 pieces.
The first will have FirstSize bits, and the rest will be in the remainder.
Definition at line 1029 of file AMDGPURegisterBankInfo.cpp.
References assert(), llvm::LLT::getElementType(), llvm::ElementCount::getFixed(), llvm::LLT::getSizeInBits(), llvm::LLT::isVector(), llvm::LLT::scalar(), and llvm::LLT::scalarOrVector().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingLoad().
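The split described here can be modelled without LLVM's LLT class. The sketch below uses a hypothetical `SimpleTy` struct (element count plus element width, where a count of 1 stands for a scalar) and a hypothetical `splitUnequal` function; it assumes that a vector split must fall on an element boundary, and it is not the implementation from the file.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a low-level type: numElts == 1 means a scalar.
struct SimpleTy {
  unsigned numElts;
  unsigned eltBits;
  unsigned sizeInBits() const { return numElts * eltBits; }
  bool isVector() const { return numElts > 1; }
};

// Split ty into a first piece of firstSize bits and a remainder piece.
std::pair<SimpleTy, SimpleTy> splitUnequal(SimpleTy ty, unsigned firstSize) {
  unsigned totalSize = ty.sizeInBits();
  assert(firstSize > 0 && firstSize < totalSize);
  if (!ty.isVector()) {
    // Scalars split into two narrower scalars.
    return std::make_pair(SimpleTy{1, firstSize},
                          SimpleTy{1, totalSize - firstSize});
  }
  // Vectors split into whole elements (assumption of this sketch).
  assert(firstSize % ty.eltBits == 0);
  unsigned firstElts = firstSize / ty.eltBits;
  unsigned restElts = (totalSize - firstSize) / ty.eltBits;
  return std::make_pair(SimpleTy{firstElts, ty.eltBits},
                        SimpleTy{restElts, ty.eltBits});
}
```

For example, a three-element vector of 32-bit elements split at 64 bits yields a two-element vector and a single 32-bit remainder, while a 96-bit scalar split at 64 bits yields a 64-bit and a 32-bit scalar.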
|
static |
Definition at line 1768 of file AMDGPURegisterBankInfo.cpp.
References assert(), llvm::SmallVectorBase< Size_T >::empty(), llvm::RegisterBankInfo::OperandsMapper::getMI(), llvm::MachineInstr::getOperand(), llvm::RegisterBankInfo::OperandsMapper::getVRegs(), llvm::MachineOperand::setReg(), and llvm::SmallVectorBase< Size_T >::size().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
|
static |
Definition at line 1746 of file AMDGPURegisterBankInfo.cpp.
References assert(), B, S32, and llvm::LLT::scalar().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingImpl().
Definition at line 1045 of file AMDGPURegisterBankInfo.cpp.
References assert(), llvm::LLT::fixed_vector(), llvm::LLT::getElementType(), llvm::LLT::getSizeInBits(), llvm::LLT::isVector(), and llvm::LLT::scalar().
Referenced by llvm::AMDGPURegisterBankInfo::applyMappingLoad().