LLVM 13.0.0git

//===----------------------------------------------------------------------===//

Implement PPCInstrInfo::isLoadFromStackSlot / isStoreToStackSlot for vector
registers, to generate better spill code.

//===----------------------------------------------------------------------===//

The first should be a single lvx from the constant pool, the second should be
a xor/stvx:

void foo(void) {
  int x[8] __attribute__((aligned(128))) = { /* ... */ };
  bar(x);
}

#include <string.h>
void foo(void) {
  int x[8] __attribute__((aligned(128)));
  memset(x, 0, sizeof(x));
  bar(x);
}

//===----------------------------------------------------------------------===//

Altivec: Codegen'ing MUL with vector FMADD should add -0.0, not 0.0.

//===----------------------------------------------------------------------===//

Consider this:

v4f32 Vector;
v4f32 Vector2 = { Vector.X, Vector.X, Vector.X, Vector.X };

Since we know that "Vector" is 16-byte aligned and we know the element offset
of "X", we should change the load into a lve*x instruction, instead of doing
a load / store / lve*x sequence.

//===----------------------------------------------------------------------===//

Implement passing vectors by value into calls and receiving them as
arguments.

//===----------------------------------------------------------------------===//

GCC apparently tries to codegen { C1, C2, Variable, C3 } as a constant pool
load, then a load and vperm of Variable.

//===----------------------------------------------------------------------===//

We need a way to teach tblgen that some operands of an intrinsic are required
to be constants.  The verifier should enforce this constraint.

//===----------------------------------------------------------------------===//

We currently codegen SCALAR_TO_VECTOR as a store of the scalar to a 16-byte
aligned stack slot, followed by a load/vperm.  We should probably just store
it to a scalar stack slot, then use lvsl/vperm to load it.  If the value is
already in memory, this is a big win.

//===----------------------------------------------------------------------===//

extract_vector_elt of an arbitrary constant vector can be done with the
following instructions:

vTemp = vec_splat(v0, <element number>);
vec_ste(&destloc, 0, vTemp);

//===----------------------------------------------------------------------===//

For this:

#include <altivec.h>
void foo(void) {
  C = (vector float)vec_cmpeq(*A, *B);
  if (!vec_any_eq(*A, *B))
    *B = /* ... */;
  *A = C;
}

we get the following basic block:

        lvx v3, 0, r4
        lvx v2, 0, r3
        vcmpeqfp v4, v2, v3
        vcmpeqfp v2, v2, v3
        bne cr6, LBB1_2 ; cond_next

The vcmpeqfp/vcmpeqfp instructions currently cannot be merged when the
vcmpeqfp result is used by a branch.  This can be improved.

//===----------------------------------------------------------------------===//

The code generated for this is truly awful:

vector float f(float a, float b) { /* ... */ }

LCPI1_0:        ; float
        .space ...
        .text
        .globl _test
        .align ...
_test:
        oris r3, r2, ...
        mtspr 256, r3
        lis r3, ha16(LCPI1_0)
        stfs f1, ...(r1)
        addi r4, r1, ...
        lfs f0, lo16(LCPI1_0)(r3)
        stfs f0, ...(r1)
        lvx v2, 0, r4
        lvx v3, 0, r5
        vmrghw v2, ...
        vspltw ...
        vmrghw v3, ...
        mtspr 256, r2
        blr

//===----------------------------------------------------------------------===//

int foo(vector float *x, vector float *y) { /* ... */ }

A predicate compare being used in a select_cc should have the same peephole
applied to it as a predicate compare used by a br_cc.  There should be no
mfcr here:

        oris r5, r2, ...
        li r5, ...
        li r6, ...
        lvx v3, 0, r4
        lvx v2, 0, r3
        vcmpeqfp v2, v2, v3
        mfcr ...
        rlwinm ...
        cmpwi cr0, ...
        bne cr0, LBB1_1
entry:
        mr r6, ...
        mr r3, r6
        blr

//===----------------------------------------------------------------------===//

CodeGen/PowerPC/vec_constants.ll has an and operation that should be
codegen'd to andc.  The issue is that the 'all ones' build vector is
SelectNodeTo'd to a VSPLTISB instruction node before the and/xor is
selected, which prevents the vnot pattern from matching.

//===----------------------------------------------------------------------===//

An alternative to the store/store/load approach for illegal insert_element
lowering would be:

1. store element to any ol' slot
2. lvx the slot
3. lvsl 0; splat index; vcmpeq to generate a select mask
4. lvsl slot + x; vperm to rotate result into correct slot
5. vsel result together

//===----------------------------------------------------------------------===//

Should codegen branches on vec_any/vec_all to avoid mfcr.  Two examples:

#include <altivec.h>
int f(vector float a, vector float b) {
  int aa = 0;
  if (vec_all_ge(a, b))
    aa |= 0x1;
  if (vec_any_ge(a, b))
    aa |= 0x2;
  return aa;
}

//===----------------------------------------------------------------------===//

We should do a little better with eliminating dead stores.  The stores to
the stack are dead since %a and %b are not needed:

; Function Attrs: ...
  store <16 x i8> <i8 1, i8 2, i8 3, i8 4, i8 5, i8 6, i8 7, i8 8, i8 9,
                   i8 10, i8 11, i8 12, i8 13, i8 14, i8 15, i8 16>,
        <16 x i8>* %a, align 16
  store <16 x i8> <i8 113, i8 114, i8 115, i8 116, i8 117, i8 118, i8 119,
                   i8 120, i8 121, i8 122, i8 123, i8 124, i8 125, i8 126,
                   i8 127, i8 112>,
        <16 x i8>* %b, align 16
  ... = load <16 x i8>* %a, align 16
  ... = load <16 x i8>* %b, align 16

Produces the following code with -mtriple=powerpc64-unknown-linux-gnu:

        addis 3, 2, .LCPI0_0@toc@ha
        addis 4, 2, .LCPI0_1@toc@ha
        addi 3, 3, .LCPI0_0@toc@l
        addi 4, 4, .LCPI0_1@toc@l
        lxvw4x ...
        lxvw4x ...
        stxvw4x ...
        ori 2, 2, 0
        stxvw4x ...
        vpmsumb ...
        blr

The two stxvw4x instructions are not needed.  With
-mtriple=powerpc64le-unknown-linux-gnu, the associated permutes are present
too.  The following example is found in
test/CodeGen/PowerPC/vec_add_sub_doubleword.ll:
(Extracted from README_ALTIVEC.txt.)