LLVM  13.0.0git
lib/Target/PowerPC/README.txt File Reference
#include <stdlib.h>
Include dependency graph for README.txt:

Functions

 if (sum + x < x) z++
 
bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much better (bdnz instead of bdz) but it still beats us. If we produced this with bdnz
 
bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much the loop would be a single dispatch and reference pieces of it as offsets from the start For functions like this (contrived to have lots of constants obviously)
 
We ha16 (.CPI_X_0) lfd f0
 
We lo16 (.CPI_X_0)(r2) lis r2
 
We f2 lis f2 blr It would be better to materialize CPI_X into a register, then use immediates off of the register to avoid the lis's. This is even more important in PIC mode. Note that this (and the static variable version) is discussed here for GCC
 
cond_true lis ha16 (LCPI1_0) lfs f0
 
cond_true lis lo16() LCPI1_0 (r2) lis r2
 
cond_true lis lo16() ha16 (LCPI1_1) lis r3
 
cond_true lis lo16() ha16 (LCPI1_2) lfs f2
 
cond_true lis lo16() lo16() LCPI1_2 (r3) lfs f3
 
cond_true lis lo16() lo16() lo16() LCPI1_1 (r2) fsub f0
 
cond_true lis lo16() lo16() lo16() f1 fsel f3 allowing the address of the struct to be CSE'd, avoiding PIC accesses (also reduces the size of the GOT on targets with one). Note that this is discussed here for GCC void bar (int b)
 
void foo (unsigned char *c)
 
So that ha16 (_a) la r2
 
So that lo16() _a (r2) lbz r2
 
So that lo16() r2 stb r3 blr Becomes ha16 (_a+3) lbz r2
 
So that lo16() r2 stb r3 blr Becomes lo16 (_a+3)(r2) stb r2
 
entry stw r5 blr GCC r3 srawi xor r4 subf r0 stw r5 blr which is much nicer. This theoretically may help improve twolf slightly (used in dimbox.c:142?).
 
entry mflr r11 ***stw r1 bl L00000 $pb L00000 ha16 (.CPI_foo_0-"L00000$pb") lfs f0
 
entry mflr r11 ***stw r1 bl L00000 $pb L00000 lo16 (.CPI_foo_0-"L00000$pb")(r2) fadds f1
 

Variables

TODO __pad0__
 
TODO unsigned x
 
return z
 
Should compile to something like
 
Should compile to something r3
 
Should compile to something r4 addze r3 instead we get
 
Should compile to something r4 addze r3 instead we r4
 
Should compile to something r4 addze r3 instead we r3 cmplw cr7
 
rlwinm add r4 Ick
 
rlwinm add r4 b LBB1_84
 
bb432 i LBB1_83
 
bb420 i lbzx r8
 
bb420 i lbzx r5
 
bb420 i lbzx r7 addi r6
 
bb420 i lbzx r7 addi r7
 
bb432 i mr r6 cmplwi cr0
 
bb420 i The CBE manages to produce
 
bb420 i The CBE manages to mtctr r0 loop
 
bb420 i The CBE manages to mtctr r0 r2
 
bb420 i The CBE manages to mtctr r0 r11 stbx r0
 
bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much the loop would be a single dispatch group
 
We generate
 
We f1
 
We f0
 
We f2 lis f2 blr It would be better to materialize CPI_X into a register
 
it produces a BB like this
 
cond_true lis lo16() lo16() lo16() f1 fsel f2
 
cond_true lis lo16() lo16() lo16() f1 fsel f3 blr
 
So that _foo
 
So that lo16() r2 stb r3 blr Becomes r3 they should compile to something better than
 
So that lo16() r2 stb r3 blr Becomes r3 they should compile to something better r3 subfic cmpwi bgt LBB2_2
 
entry LBB2_1
 
entry stw r5 blr GCC produces
 
entry stw r5 blr GCC r3 srawi xor r4 subf r0 stw r5 blr which is much nicer This theoretically may help improve twolf li blt LBB1_2
 
bb __pad1__
 
entry mr r2 blr This could be reduced to the much simpler
 
entry mr r2 blr This could be reduced to the much andc r2 r3 slwi or r2 rlwimi stw r3 blr We could collapse a bunch of those ORs and ANDs and generate the following equivalent code
 
entry mflr r11 ***stw r11
 
entry mflr r11 ***stw r1 bl L00000 $pb L00000 $pb
 
entry mflr r11 ***stw r1 bl L00000 $pb L00000 f0 ***lwz r1 mtlr r11 blr This is functional
 

Function Documentation

◆ _a()

So that lo16() _a ( r2  )

◆ bar()

cond_true lis lo16() lo16() lo16() f1 fsel f3 allowing the address of the struct to be CSE'd, avoiding PIC accesses (also reduces the size of the GOT on targets with one). Note that this is discussed here for GCC void bar ( int  b)

Definition at line 124 of file README.txt.

◆ better()

bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much better ( bdnz instead of  bdz)

◆ foo()

void foo ( unsigned char *  c)

Definition at line 125 of file README.txt.

◆ ha16() [1/7]

entry mflr r11*** stw r1 bl L00000 $pb L00000 ha16 ( .CPI_foo_0-"L00000$pb"  )

◆ ha16() [2/7]

We f2 lis ha16 ( .CPI_X_0)

◆ ha16() [3/7]

So that ha16 ( _a  )

◆ ha16() [4/7]

So that lo16() r2 stb r3 blr Becomes ha16 ( _a+3)

◆ ha16() [5/7]

cond_true lis ha16 ( LCPI1_0  )

◆ ha16() [6/7]

cond_true lis lo16() ha16 ( LCPI1_1  )

◆ ha16() [7/7]

cond_true lis lo16() ha16 ( LCPI1_2  )

◆ if()

if ( sum + x < x )

Definition at line 176 of file README.txt.

◆ LCPI1_0()

cond_true lis lo16() LCPI1_0 ( r2  )

◆ LCPI1_1()

cond_true lis lo16() lo16() lo16() LCPI1_1 ( r2  )

◆ LCPI1_2()

cond_true lis lo16() lo16() LCPI1_2 ( r3  )

◆ lo16() [1/3]

entry mflr r11*** stw r1 bl L00000 $pb L00000 lo16 ( .CPI_foo_0-"L00000$pb"  )

◆ lo16() [2/3]

We f2 lis lo16 ( .CPI_X_0)

◆ lo16() [3/3]

So that lo16() r2 stb r3 blr Becomes lo16 ( _a+3)

◆ slightly()

entry stw r5 blr GCC r3 srawi xor r4 subf r0 stw r5 blr which is much nicer. This theoretically may help improve twolf slightly ( used in dimbox.c:142?  )

◆ this() [1/2]

We f2 lis f2 blr It would be better to materialize CPI_X into a register, then use immediates off of the register to avoid the lis's. This is even more important in PIC mode. Note that this ( and the static variable  version)

Definition at line 88 of file README.txt.

◆ this() [2/2]

bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much the loop would be a single dispatch and reference pieces of it as offsets from the start For functions like this ( contrived to have lots of constants  obviously)

Definition at line 64 of file README.txt.


Variable Documentation

◆ $pb

entry mflr r11*** stw r1 bl L00000 $pb L00000 $pb

Definition at line 304 of file README.txt.

◆ __pad0__

TODO __pad0__

Definition at line 10 of file README.txt.

◆ __pad1__

bb __pad1__

Definition at line 201 of file README.txt.

◆ _foo

So that lo16() r2 stb r3 blr Becomes _foo

Definition at line 132 of file README.txt.

◆ blr

entry mr r2 blr This could be reduced to the much andc r2 r3 slwi or r2 rlwimi stw r3 blr We could collapse a bunch of those ORs and ANDs and generate the following equivalent r3 rlwinm or r4 stw r3 blr
Initial value:
===-------------------------------------------------------------------------===
Squish small scalar globals together into a single global struct

Definition at line 108 of file README.txt.
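
The "squish small scalar globals together into a single global struct" note above can be illustrated by doing the merge by hand in C. This is only a sketch of the idea, not what the backend does: the names `hits`, `misses`, `evictions`, and `stats` are hypothetical. With separate globals, PIC code may have to form three independent addresses; with one struct, a single base address can be computed once and the fields reached by constant offsets.

```c
/* Before: three independent small globals. In PIC mode each one may
   need its own address computation (and its own GOT slot on targets
   that have a GOT). All names here are hypothetical. */
int hits = 1;
int misses = 2;
int evictions = 3;

int total_separate(void) {
    return hits + misses + evictions;
}

/* After: the same scalars merged by hand into one struct. The address
   of `stats` can be formed once and reused, with each field accessed
   as a fixed offset from that base. */
struct stats_t {
    int hits;
    int misses;
    int evictions;
};

struct stats_t stats = {1, 2, 3};

int total_merged(void) {
    return stats.hits + stats.misses + stats.evictions;
}
```

Both functions return the same value; the difference is only in how many distinct symbol addresses the generated code has to materialize.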

◆ code

entry mr r2 blr This could be reduced to the much andc r2 r3 slwi or r2 rlwimi stw r3 blr We could collapse a bunch of those ORs and ANDs and generate the following equivalent code

Definition at line 282 of file README.txt.

◆ cr0

A predicate compare being used in a select_cc should have the same peephole applied to it as a predicate compare used by a br_cc. There should be no mfcr oris r5 li li lvx r4 lvx r3 vcmpeqfp v2 mfcr rlwinm cmpwi bne cr0

Definition at line 44 of file README.txt.

◆ cr7

Should compile to something r4 addze r3 instead we r3 cmplw cr7

Definition at line 25 of file README.txt.

◆ f0

We f2 lis f0

Definition at line 76 of file README.txt.

◆ f1

entry mflr r11 ***stw r1 bl L00000 $pb L00000 f1

Definition at line 76 of file README.txt.

◆ f2

cond_true lis lo16() lo16() lo16() f1 fsel f2

Definition at line 105 of file README.txt.

◆ functional

entry mflr r11*** stw r1 bl L00000 $pb L00000 f0*** lwz r1 mtlr r11 blr This is functional

Definition at line 311 of file README.txt.

◆ generate

We generate

Definition at line 72 of file README.txt.

◆ get

Should compile to something r4 addze r3 instead we get

Definition at line 24 of file README.txt.


◆ group

bb420 i The CBE manages to mtctr r0 r11 stbx r9 addi bdz later b loop This could be much the loop would be a single dispatch group
Initial value:
===-------------------------------------------------------------------------===
Lump the constant pool for each function into ONE pic object

Definition at line 61 of file README.txt.
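
The "lump the constant pool for each function into ONE pic object" note above, together with "reference pieces of it as offsets from the start", can be sketched by hand in C. This is an illustrative assumption about the shape of the idea, not the function from the README: `poly_separate`, `poly_pooled`, `pool`, and the constant values are all hypothetical.

```c
/* Each floating-point literal below may become its own constant-pool
   entry, each needing a separate address computation before the load.
   Function names and constants are hypothetical. */
double poly_separate(double y) {
    return (y * 1.5 + 2.25) * 0.5 + 4.0;
}

/* Hand-written equivalent: all constants live in one object, so one
   base address is enough and each constant is a fixed offset from it. */
static const double pool[4] = {1.5, 2.25, 0.5, 4.0};

double poly_pooled(double y) {
    return (y * pool[0] + pool[1]) * pool[2] + pool[3];
}
```

The two functions compute the same polynomial; the second one only makes explicit the "one base, many offsets" layout that the note asks the code generator to produce on its own.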


◆ Ick

rlwinm add r4 Ick
Initial value:
===-------------------------------------------------------------------------===
We compile the hottest inner loop of viterbi to:
li r6

Definition at line 32 of file README.txt.

◆ LBB1_2

gets compiled into this on rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movq rsp movq rsp movq rsp movq rsp movq rsp rax movq rsp rax movq rsp rsp rsp eax eax jbe LBB1_3 rcx rax movq rsp LBB1_2

Definition at line 201 of file README.txt.

◆ LBB1_83

bb432 i mr r6 cmplwi bne LBB1_83

Definition at line 38 of file README.txt.

◆ LBB1_84

bb420 i lbzx r7 addi stbx r7 LBB1_84

Definition at line 37 of file README.txt.

◆ LBB2_1

entry LBB2_1

Definition at line 165 of file README.txt.

◆ LBB2_2

entry mr r3 LBB2_2

Definition at line 164 of file README.txt.

◆ like

Where MAX_UNSIGNED state is a bit int On a bit platform it would be just so cool to turn it into something like

Definition at line 19 of file README.txt.

◆ loop

bb420 i The CBE manages to mtctr r0 loop

Definition at line 52 of file README.txt.

◆ produce

bb420 i The CBE manages to produce

Definition at line 49 of file README.txt.

◆ produces

entry stw r5 blr GCC produces

Definition at line 174 of file README.txt.

◆ r0

entry stw r5 blr GCC r3 srawi xor r4 subf r0 stw r0

Definition at line 53 of file README.txt.

◆ r11

entry mflr r11 ***stw r1 bl L00000 $pb L00000 f0 ***lwz r11

Definition at line 300 of file README.txt.

◆ r2

entry mflr r11*** stw r1 bl L00000 $pb L00000 r2

Definition at line 52 of file README.txt.

◆ r3

entry mr r2 blr This could be reduced to the much andc r3

Definition at line 19 of file README.txt.

◆ r4

entry mr r2 blr This could be reduced to the much andc r2 r3 slwi or r2 rlwimi stw r3 blr We could collapse a bunch of those ORs and ANDs and generate the following equivalent r3 rlwinm r4

Definition at line 24 of file README.txt.

◆ r5

bb420 i lbzx r5

Definition at line 39 of file README.txt.

◆ r6

entry mr r6

Definition at line 40 of file README.txt.

◆ r7

bb432 i mr r6 cmplwi r7

Definition at line 40 of file README.txt.

◆ r8

bb420 i lbzx r7 addi stbx r8

Definition at line 39 of file README.txt.

◆ register

We f2 lis f2 blr It would be better to materialize CPI_X into a register

Definition at line 84 of file README.txt.

◆ simpler

entry mr r2 blr This could be reduced to the much simpler

Definition at line 210 of file README.txt.

◆ than

So that lo16() r2 stb r3 blr Becomes r3 they should compile to something better than

Definition at line 161 of file README.txt.

◆ this

entry mr r2 blr This could be reduced to the much andc r2 r3 slwi or r2 rlwimi stw r3 blr We could collapse a bunch of those ORs and ANDs and generate the following equivalent r3 rlwinm or r4 stw r3 so we need to get the LR register. This ends up producing code like this

Definition at line 97 of file README.txt.

◆ x

TODO unsigned x
Initial value:
{
unsigned z = sum + x

Definition at line 10 of file README.txt.


◆ z

return z