LLVM 13.0.0git
lib/Target/SystemZ/README.txt File Reference

The initial backend is deliberately restricted to z10. We should add support
for later architectures at some point.

--

If an asm ties an i32 "r" result to an i64 input, the input will be treated
as an i32, leaving the upper bits uninitialised. For example, a test in
CodeGen/SystemZ/asm-*.ll that stores such a result through an i32*
destination ("store i32 %val, i32 *%dst" followed by "ret void") will use
LHI rather than LGHI to load the input constant. This seems to be a general
target-independent problem.
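A minimal IR sketch of the pattern just described, assuming a dummy asm
string and an illustrative constant (the exact function in the asm test may
differ):

    define void @f(i32 *%dst) {
      ; "=r,0" ties the i32 result to the i64 input operand; the backend then
      ; treats the input as i32 and loads it with LHI instead of LGHI.
      ; "blah $0" and the constant 103 are placeholders.
      %val = call i32 asm "blah $0", "=r,0" (i64 103)
      store i32 %val, i32 *%dst
      ret void
    }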
 
--

The tuning of the choice between LOAD ADDRESS (LA) and addition in
SystemZISelDAGToDAG.cpp is suspect. It should be tweaked based on
performance measurements.

--

There is no scheduling support.

--

We don't use the BRANCH ON INDEX instructions.

--

We only use MVC, XC and CLC for constant-length block operations. We could
extend them to variable-length operations too.
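As an illustration (a hedged sketch, not taken from the SystemZ tests): the
constant-length memcpy below is eligible for MVC, while the variable-length
form currently is not.

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

    define void @copy_const(i8 *%dst, i8 *%src) {
      ; Constant length: can be lowered to a single MVC.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 100, i1 false)
      ret void
    }

    define void @copy_var(i8 *%dst, i8 *%src, i64 %len) {
      ; Variable length: not currently lowered to MVC.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %len, i1 false)
      ret void
    }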
 
--

Halfword loads whose result is extended more widely than necessary end up as
an LLGH followed by a register-to-register copy (LR) before the BR %r14
return, but truncating the load would give a single LH followed by BR %r14.
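A hedged IR sketch of this kind of pattern (the extension widths are
illustrative; the exact original example is not shown on this page):

    define i32 @trunc_load(i16 *%src) {
      ; The halfword is extended to i64 and only truncated afterwards, so the
      ; load is emitted in its widest form and an extra register copy is
      ; needed for the i32 return value.
      %half = load i16, i16 *%src
      %ext = zext i16 %half to i64
      %res = trunc i64 %ext to i32
      ret i32 %res
    }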
 
--

Functions that simply return the result of an i64 AND ("ret i64 %and") ought
to be implemented as a single NGR followed by BR %r14, but two-address
optimizations reverse the order of the AND and force an extra register copy
(LGR) between the NGR and the return. CodeGen/SystemZ/and-*.ll has several
examples of this.
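A hedged sketch of such a function, assuming plain register operands (the
cases in the and tests may differ):

    define i64 @and_ret(i64 %a, i64 %b) {
      ; Per the note above, the ideal output is a single NGR plus BR %r14,
      ; but two-address handling of the AND currently inserts an extra LGR.
      %and = and i64 %a, %b
      ret i64 %and
    }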
 
--

Out-of-range displacements are usually handled by loading the full address
into a register. In many cases it would be better to create an anchor point
instead, e.g. for accesses at large offsets from an incoming i64 %base.
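A hedged sketch of the kind of code meant here (the offsets and stored
values are illustrative):

    define void @f(i64 %base) {
      ; Both offsets are outside the 12-bit unsigned and 20-bit signed
      ; displacement ranges, so each store currently materialises a full
      ; address. A shared anchor register near %base + 0x100000 would let
      ; both stores use short displacements from the anchor instead.
      %addr1 = add i64 %base, 1048576
      %ptr1 = inttoptr i64 %addr1 to i32*
      store i32 0, i32 *%ptr1
      %addr2 = add i64 %base, 1048580
      %ptr2 = inttoptr i64 %addr2 to i32*
      store i32 1, i32 *%ptr2
      ret void
    }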
 
