//===-- X86ISelLowering.h - X86 DAG Lowering Interface ----------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file defines the interfaces that X86 uses to lower LLVM code into a
// selection DAG.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_X86_X86ISELLOWERING_H
#define LLVM_LIB_TARGET_X86_X86ISELLOWERING_H

#include "llvm/CodeGen/CallingConvLower.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/Target/TargetOptions.h"

namespace llvm {
  class X86Subtarget;
  class X86TargetMachine;

  namespace X86ISD {
    // X86 Specific DAG Nodes
    enum NodeType : unsigned {
      // Start the numbering where the builtin ops leave off.
      FIRST_NUMBER = ISD::BUILTIN_OP_END,

      /// Bit scan forward.
      BSF,
      /// Bit scan reverse.
      BSR,

      /// Double shift instructions. These correspond to
      /// X86::SHLDxx and X86::SHRDxx instructions.
      SHLD,
      SHRD,

      /// Bitwise logical AND of floating point values. This corresponds
      /// to X86::ANDPS or X86::ANDPD.
      FAND,

      /// Bitwise logical OR of floating point values. This corresponds
      /// to X86::ORPS or X86::ORPD.
      FOR,

      /// Bitwise logical XOR of floating point values. This corresponds
      /// to X86::XORPS or X86::XORPD.
      FXOR,

      /// Bitwise logical ANDNOT of floating point values. This
      /// corresponds to X86::ANDNPS or X86::ANDNPD.
      FANDN,

      /// These operations represent an abstract X86 call
      /// instruction, which includes a bunch of information. In particular the
      /// operands of these nodes are:
      ///
      ///     #0 - The incoming token chain
      ///     #1 - The callee
      ///     #2 - The number of arg bytes the caller pushes on the stack.
      ///     #3 - The number of arg bytes the callee pops off the stack.
      ///     #4 - The value to pass in AL/AX/EAX (optional)
      ///     #5 - The value to pass in DL/DX/EDX (optional)
      ///
      /// The result values of these nodes are:
      ///
      ///     #0 - The outgoing token chain
      ///     #1 - The first register result value (optional)
      ///     #2 - The second register result value (optional)
      ///
      CALL,

      /// This operation implements the lowering for readcyclecounter.
      RDTSC_DAG,

      /// X86 Read Time-Stamp Counter and Processor ID.
      RDTSCP_DAG,

      /// X86 Read Performance Monitoring Counters.
      RDPMC_DAG,

      /// X86 compare and logical compare instructions.
      CMP, COMI, UCOMI,

      /// X86 bit-test instructions.
      BT,

      /// X86 SetCC. Operand 0 is condition code, and operand 1 is the EFLAGS
      /// operand, usually produced by a CMP instruction.
      SETCC,

      /// X86 Select
      SELECT, SELECTS,

      // Same as SETCC except it's materialized with a sbb and the value is all
      // ones or all zeros.
      SETCC_CARRY, // R = carry_bit ? ~0 : 0

      /// X86 FP SETCC, implemented with CMP{cc}SS/CMP{cc}SD.
      /// Operands are two FP values to compare; result is a mask of
      /// 0s or 1s. Generally DTRT for C/C++ with NaNs.
      FSETCC,

      /// X86 FP SETCC, similar to above, but with output as an i1 mask and
      /// with optional rounding mode.
      FSETCCM, FSETCCM_RND,

      /// X86 conditional moves. Operand 0 and operand 1 are the two values
      /// to select from. Operand 2 is the condition code, and operand 3 is the
      /// flag operand produced by a CMP or TEST instruction. It also writes a
      /// flag result.
      CMOV,

      /// X86 conditional branches. Operand 0 is the chain operand, operand 1
      /// is the block to branch to if the condition is true, operand 2 is the
      /// condition code, and operand 3 is the flag operand produced by a CMP
      /// or TEST instruction.
      BRCOND,

      /// Return with a flag operand. Operand 0 is the chain operand, operand
      /// 1 is the number of bytes of stack to pop.
      RET_FLAG,

      /// Return from interrupt. Operand 0 is the number of bytes to pop.
      IRET,

      /// Repeat fill, corresponds to X86::REP_STOSx.
      REP_STOS,

      /// Repeat move, corresponds to X86::REP_MOVSx.
      REP_MOVS,

      /// On Darwin, this node represents the result of the popl
      /// at function entry, used for PIC code.
      GlobalBaseReg,

      /// A wrapper node for TargetConstantPool, TargetJumpTable,
      /// TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress,
      /// MCSymbol and TargetBlockAddress.
      Wrapper,

      /// Special wrapper used under X86-64 PIC mode for RIP
      /// relative displacements.
      WrapperRIP,

      /// Copies a 64-bit value from the low word of an XMM vector
      /// to an MMX vector.
      MOVDQ2Q,

      /// Copies a 32-bit value from the low word of a MMX
      /// vector to a GPR.
      MMX_MOVD2W,

      /// Copies a GPR into the low 32-bit word of a MMX vector
      /// and zero out the high word.
      MMX_MOVW2D,

      /// Extract an 8-bit value from a vector and zero extend it to
      /// i32, corresponds to X86::PEXTRB.
      PEXTRB,

      /// Extract a 16-bit value from a vector and zero extend it to
      /// i32, corresponds to X86::PEXTRW.
      PEXTRW,

      /// Insert any element of a 4 x float vector into any element
      /// of a destination 4 x float vector.
      INSERTPS,

      /// Insert the lower 8-bits of a 32-bit value to a vector,
      /// corresponds to X86::PINSRB.
      PINSRB,

      /// Insert the lower 16-bits of a 32-bit value to a vector,
      /// corresponds to X86::PINSRW.
      PINSRW,

      /// Shuffle 16 8-bit values within a vector.
      PSHUFB,

      /// Compute Sum of Absolute Differences.
      PSADBW,
      /// Compute Double Block Packed Sum-Absolute-Differences.
      DBPSADBW,

      /// Bitwise Logical AND NOT of Packed FP values.
      ANDNP,

      /// Blend where the selector is an immediate.
      BLENDI,

      /// Dynamic (non-constant condition) vector blend where only the sign bits
      /// of the condition elements are used. This is used to enforce that the
      /// condition mask is not valid for generic VSELECT optimizations.
      SHRUNKBLEND,

      /// Combined add and sub on an FP vector.
      ADDSUB,

      // FP vector ops with rounding mode.
      FADD_RND, FADDS_RND,
      FSUB_RND, FSUBS_RND,
      FMUL_RND, FMULS_RND,
      FDIV_RND, FDIVS_RND,
      FMAX_RND, FMAXS_RND,
      FMIN_RND, FMINS_RND,
      FSQRT_RND, FSQRTS_RND,

      // FP vector get exponent.
      FGETEXP_RND, FGETEXPS_RND,
      // Extract Normalized Mantissas.
      VGETMANT, VGETMANT_RND, VGETMANTS, VGETMANTS_RND,
      // FP Scale.
      SCALEF,
      SCALEFS,

      // Integer add/sub with unsigned saturation.
      ADDUS,
      SUBUS,

      // Integer add/sub with signed saturation.
      ADDS,
      SUBS,

      // Unsigned Integer average.
      AVG,

      /// Integer horizontal add/sub.
      HADD,
      HSUB,

      /// Floating point horizontal add/sub.
      FHADD,
      FHSUB,

      // Detect Conflicts Within a Vector.
      CONFLICT,

      /// Floating point max and min.
      FMAX, FMIN,

      /// Commutative FMIN and FMAX.
      FMAXC, FMINC,

      /// Scalar intrinsic floating point max and min.
      FMAXS, FMINS,

      /// Floating point reciprocal-sqrt and reciprocal approximation.
      /// Note that these typically require refinement
      /// in order to obtain suitable precision.
      FRSQRT, FRCP,

      // AVX-512 reciprocal approximations with a little more precision.
      RSQRT14, RSQRT14S, RCP14, RCP14S,

      // Thread Local Storage.
      TLSADDR,

      // Thread Local Storage. A call to get the start address
      // of the TLS block for the current module.
      TLSBASEADDR,

      // Thread Local Storage. When calling to an OS provided
      // thunk at the address from an earlier relocation.
      TLSCALL,

      // Exception Handling helpers.
      EH_RETURN,

      // SjLj exception handling setjmp.
      EH_SJLJ_SETJMP,

      // SjLj exception handling longjmp.
      EH_SJLJ_LONGJMP,

      // SjLj exception handling dispatch.
      EH_SJLJ_SETUP_DISPATCH,

      /// Tail call return. See X86TargetLowering::LowerCall for
      /// the list of operands.
      TC_RETURN,

      // Vector move to low scalar and zero higher vector elements.
      VZEXT_MOVL,

      // Vector integer zero-extend.
      VZEXT,
      // Vector integer signed-extend.
      VSEXT,

      // Vector integer truncate.
      VTRUNC,
      // Vector integer truncate with unsigned/signed saturation.
      VTRUNCUS, VTRUNCS,

      // Vector FP extend.
      VFPEXT,

      // Vector FP round.
      VFPROUND,

      // 128-bit vector logical left / right shift.
      VSHLDQ, VSRLDQ,

      // Vector shift elements.
      VSHL, VSRL, VSRA,

      // Vector variable shift right arithmetic.
      // Unlike ISD::SRA, if the shift count is greater than the element size,
      // the sign bit is used to fill the destination data element.
      VSRAV,

      // Vector shift elements by immediate.
      VSHLI, VSRLI, VSRAI,

      // Shifts of mask registers.
      KSHIFTL, KSHIFTR,

      // Bit rotate by immediate.
      VROTLI, VROTRI,

      // Vector packed double/float comparison.
      CMPP,

      // Vector integer comparisons.
      PCMPEQ, PCMPGT,
      // Vector integer comparisons, the result is in a mask vector.
      PCMPEQM, PCMPGTM,

      // v8i16 Horizontal minimum and position.
      PHMINPOS,

      MULTISHIFT,

      /// Vector comparison generating mask bits for fp and
      /// integer signed and unsigned data types.
      CMPM,
      CMPMU,
      // Vector comparison with rounding mode for FP values.
      CMPM_RND,

      // Arithmetic operations with FLAGS results.
      ADD, SUB, ADC, SBB, SMUL,
      INC, DEC, OR, XOR, AND,

      // LOW, HI, FLAGS = umul LHS, RHS.
      UMUL,

      // 8-bit SMUL/UMUL - AX, FLAGS = smul8/umul8 AL, RHS.
      SMUL8, UMUL8,

      // 8-bit divrem that zero-extend the high result (AH).
      UDIVREM8_ZEXT_HREG,
      SDIVREM8_SEXT_HREG,

      // X86-specific multiply by immediate.
      MUL_IMM,

      // Vector sign bit extraction.
      MOVMSK,

      // Vector bitwise comparisons.
      PTEST,

      // Vector packed fp sign bitwise comparisons.
      TESTP,

      // Vector "test" in AVX-512, the result is in a mask vector.
      TESTM,
      TESTNM,

      // OR/AND test for masks.
      KORTEST,
      KTEST,

      // Several flavors of instructions with vector shuffle behaviors.
      // Saturated signed/unsigned packing.
      PACKSS,
      PACKUS,
      // Intra-lane alignr.
      PALIGNR,
      // AVX512 inter-lane alignr.
      VALIGN,
      PSHUFD,
      PSHUFHW,
      PSHUFLW,
      SHUFP,
      // VBMI2 Concat & Shift.
      VSHLD,
      VSHRD,
      VSHLDV,
      VSHRDV,
      // Shuffle Packed Values at 128-bit granularity.
      SHUF128,
      MOVDDUP,
      MOVSHDUP,
      MOVSLDUP,
      MOVLHPS,
      MOVHLPS,
      MOVSD,
      MOVSS,
      UNPCKL,
      UNPCKH,
      VPERMILPV,
      VPERMILPI,
      VPERMI,
      VPERM2X128,

      // Variable Permute (VPERM).
      // Res = VPERMV MaskV, V0
      VPERMV,

      // 3-op Variable Permute (VPERMT2).
      // Res = VPERMV3 V0, MaskV, V1
      VPERMV3,

      // 3-op Variable Permute overwriting the index (VPERMI2).
      // Res = VPERMIV3 V0, MaskV, V1
      VPERMIV3,

      // Bitwise ternary logic.
      VPTERNLOG,
      // Fix Up Special Packed Float32/64 values.
      VFIXUPIMM,
      VFIXUPIMMS,
      // Range Restriction Calculation For Packed Pairs of Float32/64 values.
      VRANGE, VRANGE_RND, VRANGES, VRANGES_RND,
      // Reduce - Perform Reduction Transformation on scalar/packed FP.
      VREDUCE, VREDUCE_RND, VREDUCES, VREDUCES_RND,
      // RndScale - Round FP Values To Include A Given Number Of Fraction Bits.
      // Also used by the legacy (V)ROUND intrinsics where we mask out the
      // scaling part of the immediate.
      VRNDSCALE, VRNDSCALE_RND, VRNDSCALES, VRNDSCALES_RND,
      // Tests Types Of a FP Values for packed types.
      VFPCLASS,
      // Tests Types Of a FP Values for scalar types.
      VFPCLASSS,

      // Broadcast scalar to vector.
      VBROADCAST,
      // Broadcast mask to vector.
      VBROADCASTM,
      // Broadcast subvector to vector.
      SUBV_BROADCAST,

      /// SSE4A Extraction and Insertion.
      EXTRQI, INSERTQI,

      // XOP arithmetic/logical shifts.
      VPSHA, VPSHL,
      // XOP signed/unsigned integer comparisons.
      VPCOM, VPCOMU,
      // XOP packed permute bytes.
      VPPERM,
      // XOP two source permutation.
      VPERMIL2,

      // Vector multiply packed unsigned doubleword integers.
      PMULUDQ,
      // Vector multiply packed signed doubleword integers.
      PMULDQ,
      // Vector Multiply Packed Unsigned Integers with Round and Scale.
      MULHRS,

      // Multiply and Add Packed Integers.
      VPMADDUBSW, VPMADDWD,

      // AVX512IFMA multiply and add.
      // NOTE: These are different from the instruction and perform
      // op0 x op1 + op2.
      VPMADD52L, VPMADD52H,

      // VNNI
      VPDPBUSD,
      VPDPBUSDS,
      VPDPWSSD,
      VPDPWSSDS,

      // FMA nodes.
      // We use the target independent ISD::FMA for the non-inverted case.
      FNMADD,
      FMSUB,
      FNMSUB,
      FMADDSUB,
      FMSUBADD,

      // FMA with rounding mode.
      FMADD_RND,
      FNMADD_RND,
      FMSUB_RND,
      FNMSUB_RND,
      FMADDSUB_RND,
      FMSUBADD_RND,

      // FMA4 specific scalar intrinsics bits that zero the non-scalar bits.
      FMADD4S, FNMADD4S, FMSUB4S, FNMSUB4S,

      // Scalar intrinsic FMA.
      FMADDS1, FMADDS3,
      FNMADDS1, FNMADDS3,
      FMSUBS1, FMSUBS3,
      FNMSUBS1, FNMSUBS3,

      // Scalar intrinsic FMA with rounding mode.
      // Two versions, passthru bits on op1 or op3.
      FMADDS1_RND, FMADDS3_RND,
      FNMADDS1_RND, FNMADDS3_RND,
      FMSUBS1_RND, FMSUBS3_RND,
      FNMSUBS1_RND, FNMSUBS3_RND,

      // Compress and expand.
      COMPRESS,
      EXPAND,

      // Bits shuffle.
      VPSHUFBITQMB,

      // Convert Unsigned/Integer to Floating-Point Value with rounding mode.
      SINT_TO_FP_RND, UINT_TO_FP_RND,
      SCALAR_SINT_TO_FP_RND, SCALAR_UINT_TO_FP_RND,

      // Vector float/double to signed/unsigned integer.
      CVTP2SI, CVTP2UI, CVTP2SI_RND, CVTP2UI_RND,
      // Scalar float/double to signed/unsigned integer.
      CVTS2SI_RND, CVTS2UI_RND,

      // Vector float/double to signed/unsigned integer with truncation.
      CVTTP2SI, CVTTP2UI, CVTTP2SI_RND, CVTTP2UI_RND,
      // Scalar float/double to signed/unsigned integer with truncation.
      CVTTS2SI_RND, CVTTS2UI_RND,

      // Vector signed/unsigned integer to float/double.
      CVTSI2P, CVTUI2P,

      // Save xmm argument registers to the stack, according to %al. An operator
      // is needed so that this can be expanded with control flow.
      VASTART_SAVE_XMM_REGS,

      // Windows's _chkstk call to do stack probing.
      WIN_ALLOCA,

      // For allocating variable amounts of stack space when using
      // segmented stacks. Check if the current stacklet has enough space, and
      // falls back to heap allocation if not.
      SEG_ALLOCA,

      // Memory barriers.
      MEMBARRIER,
      MFENCE,

      // Store FP status word into i16 register.
      FNSTSW16r,

      // Store contents of %ah into %eflags.
      SAHF,

      // Get a random integer and indicate whether it is valid in CF.
      RDRAND,

      // Get a NIST SP800-90B & C compliant random integer and
      // indicate whether it is valid in CF.
      RDSEED,

      // SSE42 string comparisons.
      PCMPISTR,
      PCMPESTR,

      // Test if in transactional execution.
      XTEST,

      // ERI instructions.
      RSQRT28, RSQRT28S, RCP28, RCP28S, EXP2,

      // Conversions between float and half-float.
      CVTPS2PH, CVTPH2PS, CVTPH2PS_RND,

      // Galois Field Arithmetic Instructions
      GF2P8AFFINEINVQB, GF2P8AFFINEQB, GF2P8MULB,

      // LWP insert record.
      LWPINS,

      // Compare and swap.
      CMPXCHG_DAG = ISD::FIRST_TARGET_MEMORY_OPCODE,
      CMPXCHG8B_DAG,
      CMPXCHG16B_DAG,
      CMPXCHG8B_SAVE_EBX_DAG,
      CMPXCHG16B_SAVE_RBX_DAG,

      /// LOCK-prefixed arithmetic read-modify-write instructions.
      /// EFLAGS, OUTCHAIN = LADD(INCHAIN, PTR, RHS)
      LADD, LSUB, LOR, LXOR, LAND,

      // Load, scalar_to_vector, and zero extend.
      VZEXT_LOAD,

      // Store FP control word into i16 memory.
      FNSTCW16m,

      /// This instruction implements FP_TO_SINT with the
      /// integer destination in memory and a FP reg source. This corresponds
      /// to the X86::FIST*m instructions and the rounding mode change stuff. It
      /// has two inputs (token chain and address) and two outputs (int value
      /// and token chain).
      FP_TO_INT16_IN_MEM,
      FP_TO_INT32_IN_MEM,
      FP_TO_INT64_IN_MEM,

      /// This instruction implements SINT_TO_FP with the
      /// integer source in memory and FP reg result. This corresponds to the
      /// X86::FILD*m instructions. It has three inputs (token chain, address,
      /// and source type) and two outputs (FP value and token chain). FILD_FLAG
      /// also produces a flag.
      FILD,
      FILD_FLAG,

      /// This instruction implements an extending load to FP stack slots.
      /// This corresponds to the X86::FLD32m / X86::FLD64m. It takes a chain
      /// operand, ptr to load from, and a ValueType node indicating the type
      /// to load to.
      FLD,

      /// This instruction implements a truncating store to FP stack
      /// slots. This corresponds to the X86::FST32m / X86::FST64m. It takes a
      /// chain operand, value to store, address, and a ValueType to store it
      /// as.
      FST,

      /// This instruction grabs the address of the next argument
      /// from a va_list. (reads and modifies the va_list in memory)
      VAARG_64,

      // Vector truncating store with unsigned/signed saturation.
      VTRUNCSTOREUS, VTRUNCSTORES,
      // Vector truncating masked store with unsigned/signed saturation.
      VMTRUNCSTOREUS, VMTRUNCSTORES,

      // X86 specific gather and scatter.
      MGATHER, MSCATTER,

      // WARNING: Do not add anything in the end unless you want the node to
      // have memop! In fact, starting from FIRST_TARGET_MEMORY_OPCODE all
      // opcodes will be thought as target memory ops!
    };
  } // end namespace X86ISD

  /// Define some predicates that are used for node matching.
  namespace X86 {
    /// Returns true if Elt is a constant zero or floating point constant +0.0.
    bool isZeroNode(SDValue Elt);

    /// Returns true if the given offset can
    /// fit into the displacement field of the instruction.
    bool isOffsetSuitableForCodeModel(int64_t Offset, CodeModel::Model M,
                                      bool hasSymbolicDisplacement = true);

    /// Determines whether the callee is required to pop its
    /// own arguments. Callee pop is necessary to support tail calls.
    bool isCalleePop(CallingConv::ID CallingConv,
                     bool is64Bit, bool IsVarArg, bool GuaranteeTCO);

  } // end namespace X86

  //===--------------------------------------------------------------------===//
  //  X86 Implementation of the TargetLowering interface
  class X86TargetLowering final : public TargetLowering {
  public:
    explicit X86TargetLowering(const X86TargetMachine &TM,
                               const X86Subtarget &STI);

    unsigned getJumpTableEncoding() const override;
    bool useSoftFloat() const override;

    void markLibCallAttributes(MachineFunction *MF, unsigned CC,
                               ArgListTy &Args) const override;

    MVT getScalarShiftAmountTy(const DataLayout &, EVT VT) const override {
      return MVT::i8;
    }

    const MCExpr *
    LowerCustomJumpTableEntry(const MachineJumpTableInfo *MJTI,
                              const MachineBasicBlock *MBB, unsigned uid,
                              MCContext &Ctx) const override;

    /// Returns relocation base for the given PIC jumptable.
    SDValue getPICJumpTableRelocBase(SDValue Table,
                                     SelectionDAG &DAG) const override;
    const MCExpr *
    getPICJumpTableRelocBaseExpr(const MachineFunction *MF,
                                 unsigned JTI, MCContext &Ctx) const override;

    /// Return the desired alignment for ByVal aggregate
    /// function arguments in the caller parameter area. For X86, aggregates
    /// that contain SSE vectors are placed at 16-byte boundaries while the
    /// rest are at 4-byte boundaries.
    unsigned getByValTypeAlignment(Type *Ty,
                                   const DataLayout &DL) const override;

    /// Returns the target specific optimal type for load
    /// and store operations as a result of memset, memcpy, and memmove
    /// lowering. If DstAlign is zero, it is safe to assume the destination
    /// alignment can satisfy any constraint. Similarly, if SrcAlign is zero
    /// there is no need to check it against the alignment requirement,
    /// probably because the source does not need to be loaded. If 'IsMemset' is
    /// true, that means it's expanding a memset. If 'ZeroMemset' is true, that
    /// means it's a memset of zero. 'MemcpyStrSrc' indicates whether the memcpy
    /// source is constant so it does not need to be loaded.
    /// It returns EVT::Other if the type should be determined using generic
    /// target-independent logic.
    EVT getOptimalMemOpType(uint64_t Size, unsigned DstAlign, unsigned SrcAlign,
                            bool IsMemset, bool ZeroMemset, bool MemcpyStrSrc,
                            MachineFunction &MF) const override;

    /// Returns true if it's safe to use load / store of the
    /// specified type to expand memcpy / memset inline. This is mostly true
    /// for all types except for some special cases. For example, on X86
    /// targets without SSE2 f64 load / store are done with fldl / fstpl which
    /// also does type conversion. Note the specified type doesn't have to be
    /// legal as the hook is used before type legalization.
    bool isSafeMemOpType(MVT VT) const override;

    /// Returns true if the target allows unaligned memory accesses of the
    /// specified type. Returns whether it is "fast" in the last argument.
    bool allowsMisalignedMemoryAccesses(EVT VT, unsigned AS, unsigned Align,
                                        bool *Fast) const override;

    /// Provide custom lowering hooks for some operations.
    SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const override;

    /// Places new result values for the node in Results (their number
    /// and types must exactly match those of the original return values of
    /// the node), or leaves Results empty, which indicates that the node is not
    /// to be custom lowered after all.
    void LowerOperationWrapper(SDNode *N,
                               SmallVectorImpl<SDValue> &Results,
                               SelectionDAG &DAG) const override;

    /// Replace the results of node with an illegal result
    /// type with new values built out of custom code.
    void ReplaceNodeResults(SDNode *N, SmallVectorImpl<SDValue> &Results,
                            SelectionDAG &DAG) const override;

    SDValue PerformDAGCombine(SDNode *N, DAGCombinerInfo &DCI) const override;

    // Return true if it is profitable to combine a BUILD_VECTOR with a
    // stride-pattern to a shuffle and a truncate.
    // Example of such a combine:
    // v4i32 build_vector((extract_elt V, 1),
    //                    (extract_elt V, 3),
    //                    (extract_elt V, 5),
    //                    (extract_elt V, 7))
    //  -->
    // v4i32 truncate (bitcast (shuffle<1,u,3,u,4,u,5,u,6,u,7,u> V, u) to
    // v4i64)
    bool isDesirableToCombineBuildVectorToShuffleTruncate(
        ArrayRef<int> ShuffleMask, EVT SrcVT, EVT TruncVT) const override;

    /// Return true if the target has native support for
    /// the specified value type and it is 'desirable' to use the type for the
    /// given node type. e.g. On x86 i16 is legal, but undesirable since i16
    /// instruction encodings are longer and some i16 instructions are slow.
    bool isTypeDesirableForOp(unsigned Opc, EVT VT) const override;

    /// Return true if the target has native support for the
    /// specified value type and it is 'desirable' to use the type. e.g. On x86
    /// i16 is legal, but undesirable since i16 instruction encodings are longer
    /// and some i16 instructions are slow.
    bool IsDesirableToPromoteOp(SDValue Op, EVT &PVT) const override;

    MachineBasicBlock *
    EmitInstrWithCustomInserter(MachineInstr &MI,
                                MachineBasicBlock *MBB) const override;

    /// This method returns the name of a target specific DAG node.
    const char *getTargetNodeName(unsigned Opcode) const override;

    bool mergeStoresAfterLegalization() const override { return true; }

    bool canMergeStoresTo(unsigned AddressSpace, EVT MemVT,
                          const SelectionDAG &DAG) const override;

    bool isCheapToSpeculateCttz() const override;

    bool isCheapToSpeculateCtlz() const override;

    bool isCtlzFast() const override;

    bool hasBitPreservingFPLogic(EVT VT) const override {
      return VT == MVT::f32 || VT == MVT::f64 || VT.isVector();
    }

    bool isMultiStoresCheaperThanBitsMerge(EVT LTy, EVT HTy) const override {
      // If the pair to store is a mixture of float and int values, we will
      // save two bitwise instructions and one float-to-int instruction and
      // add one store instruction. There is potentially a more
      // significant benefit because it avoids the float->int domain switch
      // for the input value, so it is more likely a win.
      if ((LTy.isFloatingPoint() && HTy.isInteger()) ||
          (LTy.isInteger() && HTy.isFloatingPoint()))
        return true;
      // If the pair only contains int values, we will save two bitwise
      // instructions and add one store instruction (costing one more
      // store buffer). Since the benefit is less clear, we leave
      // such pairs out until we have a test case proving it is a win.
      return false;
    }
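    // For example, storing a {float, int} pair as one merged i64 would need a
    // float->int bitcast plus shift/or instructions to combine the halves
    // before a single store; two separate scalar stores avoid all of that
    // merging work and the FP-to-integer domain crossing.
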
    bool isMaskAndCmp0FoldingBeneficial(const Instruction &AndI) const override;

    bool hasAndNotCompare(SDValue Y) const override;

    bool convertSetCCLogicToBitwiseLogic(EVT VT) const override {
      return VT.isScalarInteger();
    }

    /// Vector-sized comparisons are fast using PCMPEQ + PMOVMSK or PTEST.
    MVT hasFastEqualityCompare(unsigned NumBits) const override;

    /// Allow multiple load pairs per block for smaller and faster code.
    unsigned getMemcmpEqZeroLoadsPerBlock() const override {
      return 2;
    }
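    // For example, expanding "memcmp(a, b, 16) == 0" can then compare two
    // 8-byte loads from each buffer in a single block, instead of emitting a
    // libcall or splitting the comparison across multiple blocks.
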
    /// Return the value type to use for ISD::SETCC.
    EVT getSetCCResultType(const DataLayout &DL, LLVMContext &Context,
                           EVT VT) const override;

    bool targetShrinkDemandedConstant(SDValue Op, const APInt &Demanded,
                                      TargetLoweringOpt &TLO) const override;

    /// Determine which of the bits specified in Mask are known to be either
    /// zero or one and return them in the Known bitset.
    void computeKnownBitsForTargetNode(const SDValue Op,
                                       KnownBits &Known,
                                       const APInt &DemandedElts,
                                       const SelectionDAG &DAG,
                                       unsigned Depth = 0) const override;

    /// Determine the number of bits in the operation that are sign bits.
    unsigned ComputeNumSignBitsForTargetNode(SDValue Op,
                                             const APInt &DemandedElts,
                                             const SelectionDAG &DAG,
                                             unsigned Depth) const override;

    SDValue unwrapAddress(SDValue N) const override;

    bool isGAPlusOffset(SDNode *N, const GlobalValue *&GA,
                        int64_t &Offset) const override;

    SDValue getReturnAddressFrameIndex(SelectionDAG &DAG) const;

    bool ExpandInlineAsm(CallInst *CI) const override;

    ConstraintType getConstraintType(StringRef Constraint) const override;

    /// Examine constraint string and operand type and determine a weight value.
    /// The operand object must already have been set up with the operand type.
    ConstraintWeight
    getSingleConstraintMatchWeight(AsmOperandInfo &info,
                                   const char *constraint) const override;

    const char *LowerXConstraint(EVT ConstraintVT) const override;

    /// Lower the specified operand into the Ops vector. If it is invalid, don't
    /// add anything to Ops. If hasMemory is true it means one of the asm
    /// constraints of the inline asm instruction being processed is 'm'.
    void LowerAsmOperandForConstraint(SDValue Op,
                                      std::string &Constraint,
                                      std::vector<SDValue> &Ops,
                                      SelectionDAG &DAG) const override;

    unsigned
    getInlineAsmMemConstraint(StringRef ConstraintCode) const override {
      if (ConstraintCode == "i")
        return InlineAsm::Constraint_i;
      else if (ConstraintCode == "o")
        return InlineAsm::Constraint_o;
      else if (ConstraintCode == "v")
        return InlineAsm::Constraint_v;
      else if (ConstraintCode == "X")
        return InlineAsm::Constraint_X;
      return TargetLowering::getInlineAsmMemConstraint(ConstraintCode);
    }

    /// Given a physical register constraint
    /// (e.g. {edx}), return the register number and the register class for the
    /// register. This should only be used for C_Register constraints. On
    /// error, this returns a register number of 0.
    std::pair<unsigned, const TargetRegisterClass *>
    getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
                                 StringRef Constraint, MVT VT) const override;

    /// Return true if the addressing mode represented
    /// by AM is legal for this target, for a load/store of the specified type.
    bool isLegalAddressingMode(const DataLayout &DL, const AddrMode &AM,
                               Type *Ty, unsigned AS,
                               Instruction *I = nullptr) const override;

    /// Return true if the specified immediate is a legal
    /// icmp immediate, that is the target has icmp instructions which can
    /// compare a register against the immediate without having to materialize
    /// the immediate into a register.
    bool isLegalICmpImmediate(int64_t Imm) const override;

    /// Return true if the specified immediate is a legal
    /// add immediate, that is the target has add instructions which can
    /// add a register and the immediate without having to materialize
    /// the immediate into a register.
    bool isLegalAddImmediate(int64_t Imm) const override;

    /// \brief Return the cost of the scaling factor used in the addressing
    /// mode represented by AM for this target, for a load/store
    /// of the specified type.
    /// If the AM is supported, the return value must be >= 0.
    /// If the AM is not supported, it returns a negative value.
    int getScalingFactorCost(const DataLayout &DL, const AddrMode &AM, Type *Ty,
                             unsigned AS) const override;

    bool isVectorShiftByScalarCheap(Type *Ty) const override;

    /// Return true if it's free to truncate a value of
    /// type Ty1 to type Ty2. e.g. On x86 it's free to truncate an i32 value in
    /// register EAX to i16 by referencing its sub-register AX.
    bool isTruncateFree(Type *Ty1, Type *Ty2) const override;
    bool isTruncateFree(EVT VT1, EVT VT2) const override;

    bool allowTruncateForTailCall(Type *Ty1, Type *Ty2) const override;

    /// Return true if any actual instruction that defines a
    /// value of type Ty1 implicitly zero-extends the value to Ty2 in the result
    /// register. This does not necessarily include registers defined in
    /// unknown ways, such as incoming arguments, or copies from unknown
    /// virtual registers. Also, if isTruncateFree(Ty2, Ty1) is true, this
    /// does not necessarily apply to truncate instructions. e.g. on x86-64,
    /// all instructions that define 32-bit values implicitly zero-extend the
    /// result out to 64 bits.
    bool isZExtFree(Type *Ty1, Type *Ty2) const override;
    bool isZExtFree(EVT VT1, EVT VT2) const override;
    bool isZExtFree(SDValue Val, EVT VT2) const override;

    /// Return true if folding a vector load into ExtVal (a sign, zero, or any
    /// extend node) is profitable.
    bool isVectorLoadExtDesirable(SDValue) const override;

    /// Return true if an FMA operation is faster than a pair of fmul and fadd
    /// instructions. fmuladd intrinsics will be expanded to FMAs when this
    /// method returns true, otherwise fmuladd is expanded to fmul + fadd.
    bool isFMAFasterThanFMulAndFAdd(EVT VT) const override;

    /// Return true if it's profitable to narrow
    /// operations of type VT1 to VT2. e.g. on x86, it's profitable to narrow
    /// from i32 to i8 but not from i32 to i16.
    bool isNarrowingProfitable(EVT VT1, EVT VT2) const override;

    /// Given an intrinsic, checks if on the target the intrinsic will need to map
    /// to a MemIntrinsicNode (touches memory). If this is the case, it returns
    /// true and stores the intrinsic information into the IntrinsicInfo that was
    /// passed to the function.
    bool getTgtMemIntrinsic(IntrinsicInfo &Info, const CallInst &I,
                            MachineFunction &MF,
                            unsigned Intrinsic) const override;

    /// Returns true if the target can instruction select the
    /// specified FP immediate natively. If false, the legalizer will
    /// materialize the FP immediate as a load from a constant pool.
    bool isFPImmLegal(const APFloat &Imm, EVT VT) const override;

    /// Targets can use this to indicate that they only support *some*
    /// VECTOR_SHUFFLE operations, those with specific masks. By default, if a
    /// target supports the VECTOR_SHUFFLE node, all mask values are assumed to
    /// be legal.
    bool isShuffleMaskLegal(ArrayRef<int> Mask, EVT VT) const override;

    /// Similar to isShuffleMaskLegal. Targets can use this to indicate if there
    /// is a suitable VECTOR_SHUFFLE that can be used to replace a VAND with a
    /// constant pool entry.
    bool isVectorClearMaskLegal(const SmallVectorImpl<int> &Mask,
                                EVT VT) const override;

    /// If true, then instruction selection should
    /// seek to shrink the FP constant of the specified type to a smaller type
    /// in order to save space and / or reduce runtime.
    bool ShouldShrinkFPConstant(EVT VT) const override {
      // Don't shrink FP constpool if SSE2 is available since cvtss2sd is more
      // expensive than a straight movsd. On the other hand, it's important to
      // shrink long double fp constants since fldt is very slow.
      return !X86ScalarSSEf64 || VT == MVT::f80;
    }

    /// Return true if we believe it is correct and profitable to reduce the
    /// load node to a smaller type.
    bool shouldReduceLoadWidth(SDNode *Load, ISD::LoadExtType ExtTy,
                               EVT NewVT) const override;

    /// Return true if the specified scalar FP type is computed in an SSE
    /// register, not on the X87 floating point stack.
    bool isScalarFPTypeInSSEReg(EVT VT) const {
      return (VT == MVT::f64 && X86ScalarSSEf64) || // f64 is used when SSE2 is available
             (VT == MVT::f32 && X86ScalarSSEf32);   // f32 is used when SSE1 is available
    }

    /// \brief Returns true if it is beneficial to convert a load of a constant
    /// to just the constant itself.
    bool shouldConvertConstantLoadToIntImm(const APInt &Imm,
                                           Type *Ty) const override;

    bool convertSelectOfConstantsToMath(EVT VT) const override;

    /// Return true if EXTRACT_SUBVECTOR is cheap for this result type
    /// with this index.
    bool isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
                                 unsigned Index) const override;

    bool storeOfVectorConstantIsCheap(EVT MemVT, unsigned NumElem,
                                      unsigned AddrSpace) const override {
      // If we can replace more than 2 scalar stores, there will be a reduction
      // in instructions even after we add a vector constant load.
      return NumElem > 2;
    }
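    // For example, replacing four scalar constant stores with one constant
    // pool vector load plus one vector store drops the instruction count from
    // four to two, while replacing only two scalar stores would break even.
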
    bool isLoadBitCastBeneficial(EVT LoadVT, EVT BitcastVT) const override;

    /// Intel processors have a unified instruction and data cache.
    const char *getClearCacheBuiltinName() const override {
      return nullptr; // nothing to do, move along.
    }

    unsigned getRegisterByName(const char* RegName, EVT VT,
                               SelectionDAG &DAG) const override;

    /// If a physical register, this returns the register that receives the
    /// exception address on entry to an EH pad.
    unsigned
    getExceptionPointerRegister(const Constant *PersonalityFn) const override;

    /// If a physical register, this returns the register that receives the
    /// exception typeid on entry to a landing pad.
    unsigned
    getExceptionSelectorRegister(const Constant *PersonalityFn) const override;

    bool needsFixedCatchObjects() const override;

    /// This method returns a target specific FastISel object,
    /// or null if the target does not support "fast" ISel.
    FastISel *createFastISel(FunctionLoweringInfo &funcInfo,
                             const TargetLibraryInfo *libInfo) const override;

    /// If the target has a standard location for the stack protector cookie,
    /// returns the address of that location. Otherwise, returns nullptr.
    Value *getIRStackGuard(IRBuilder<> &IRB) const override;

    bool useLoadStackGuardNode() const override;
    bool useStackGuardXorFP() const override;
    void insertSSPDeclarations(Module &M) const override;
    Value *getSDagStackGuard(const Module &M) const override;
    Value *getSSPStackGuardCheck(const Module &M) const override;
    SDValue emitStackGuardXorFP(SelectionDAG &DAG, SDValue Val,
                                const SDLoc &DL) const override;

    /// Return true if the target stores SafeStack pointer at a fixed offset in
    /// some non-standard address space, and populates the address space and
    /// offset as appropriate.
    Value *getSafeStackPointerLocation(IRBuilder<> &IRB) const override;

    SDValue BuildFILD(SDValue Op, EVT SrcVT, SDValue Chain, SDValue StackSlot,
                      SelectionDAG &DAG) const;

    bool isNoopAddrSpaceCast(unsigned SrcAS, unsigned DestAS) const override;

    /// \brief Customize the preferred legalization strategy for certain types.
    LegalizeTypeAction getPreferredVectorAction(EVT VT) const override;

    bool isIntDivCheap(EVT VT, AttributeList Attr) const override;

    bool supportSwiftError() const override;

    StringRef getStackProbeSymbolName(MachineFunction &MF) const override;

    unsigned getMaxSupportedInterleaveFactor() const override { return 4; }

    /// \brief Lower interleaved load(s) into target specific
    /// instructions/intrinsics.
    bool lowerInterleavedLoad(LoadInst *LI,
                              ArrayRef<ShuffleVectorInst *> Shuffles,
                              ArrayRef<unsigned> Indices,
                              unsigned Factor) const override;

    /// \brief Lower interleaved store(s) into target specific
    /// instructions/intrinsics.
    bool lowerInterleavedStore(StoreInst *SI, ShuffleVectorInst *SVI,
                               unsigned Factor) const override;

    void finalizeLowering(MachineFunction &MF) const override;

  protected:
    std::pair<const TargetRegisterClass *, uint8_t>
    findRepresentativeClass(const TargetRegisterInfo *TRI,
                            MVT VT) const override;

  private:
    /// Keep a reference to the X86Subtarget around so that we can
    /// make the right decision when generating code for different targets.
    const X86Subtarget &Subtarget;

    /// Select between SSE or x87 floating point ops.
    /// When SSE is available, use it for f32 operations.
    /// When SSE2 is available, use it for f64 operations.
    bool X86ScalarSSEf32;
    bool X86ScalarSSEf64;

    /// A list of legal FP immediates.
    std::vector<APFloat> LegalFPImmediates;

    /// Indicate that this x86 target can instruction
    /// select the specified FP immediate natively.
    void addLegalFPImmediate(const APFloat& Imm) {
      LegalFPImmediates.push_back(Imm);
    }

    SDValue LowerCallResult(SDValue Chain, SDValue InFlag,
                            CallingConv::ID CallConv, bool isVarArg,
                            const SmallVectorImpl<ISD::InputArg> &Ins,
                            const SDLoc &dl, SelectionDAG &DAG,
                            SmallVectorImpl<SDValue> &InVals,
                            uint32_t *RegMask) const;
    SDValue LowerMemArgument(SDValue Chain, CallingConv::ID CallConv,
                             const SmallVectorImpl<ISD::InputArg> &ArgInfo,
                             const SDLoc &dl, SelectionDAG &DAG,
                             const CCValAssign &VA, MachineFrameInfo &MFI,
                             unsigned i) const;
    SDValue LowerMemOpCallTo(SDValue Chain, SDValue StackPtr, SDValue Arg,
                             const SDLoc &dl, SelectionDAG &DAG,
                             const CCValAssign &VA,
                             ISD::ArgFlagsTy Flags) const;

    // Call lowering helpers.

    /// Check whether the call is eligible for tail call optimization. Targets
    /// that want to do tail call optimization should implement this function.
    bool IsEligibleForTailCallOptimization(SDValue Callee,
                                           CallingConv::ID CalleeCC,
                                           bool isVarArg,
                                           bool isCalleeStructRet,
                                           bool isCallerStructRet,
                                           Type *RetTy,
                                           const SmallVectorImpl<ISD::OutputArg> &Outs,
                                           const SmallVectorImpl<SDValue> &OutVals,
                                           const SmallVectorImpl<ISD::InputArg> &Ins,
                                           SelectionDAG &DAG) const;
    SDValue EmitTailCallLoadRetAddr(SelectionDAG &DAG, SDValue &OutRetAddr,
                                    SDValue Chain, bool IsTailCall,
                                    bool Is64Bit, int FPDiff,
                                    const SDLoc &dl) const;

    unsigned GetAlignedArgumentStackSize(unsigned StackSize,
                                         SelectionDAG &DAG) const;

    unsigned getAddressSpace(void) const;

    std::pair<SDValue, SDValue> FP_TO_INTHelper(SDValue Op, SelectionDAG &DAG,
                                                bool isSigned,
                                                bool isReplace) const;

    SDValue LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerVSELECT(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerEXTRACT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerINSERT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;

    unsigned getGlobalWrapperKind(const GlobalValue *GV = nullptr) const;
    SDValue LowerConstantPool(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerBlockAddress(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerGlobalAddress(const GlobalValue *GV, const SDLoc &dl,
                               int64_t Offset, SelectionDAG &DAG) const;
    SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerExternalSymbol(SDValue Op, SelectionDAG &DAG) const;

    SDValue LowerSINT_TO_FP(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerUINT_TO_FP(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerTRUNCATE(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerSETCC(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerSETCCCARRY(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerSELECT(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerBRCOND(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerJumpTable(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerDYNAMIC_STACKALLOC(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerVASTART(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerVAARG(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerADDROFRETURNADDR(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerFRAME_TO_ARGS_OFFSET(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerEH_RETURN(SDValue Op, SelectionDAG &DAG) const;
    SDValue lowerEH_SJLJ_SETJMP(SDValue Op, SelectionDAG &DAG) const;
    SDValue lowerEH_SJLJ_LONGJMP(SDValue Op, SelectionDAG &DAG) const;
    SDValue lowerEH_SJLJ_SETUP_DISPATCH(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerFLT_ROUNDS_(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerWin64_i128OP(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerGC_TRANSITION_START(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerGC_TRANSITION_END(SDValue Op, SelectionDAG &DAG) const;
    SDValue LowerINTRINSIC_WO_CHAIN(SDValue Op, SelectionDAG &DAG) const;

    SDValue
    LowerFormalArguments(SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
                         const SmallVectorImpl<ISD::InputArg> &Ins,
                         const SDLoc &dl, SelectionDAG &DAG,
                         SmallVectorImpl<SDValue> &InVals) const override;
    SDValue LowerCall(CallLoweringInfo &CLI,
                      SmallVectorImpl<SDValue> &InVals) const override;

    SDValue LowerReturn(SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
                        const SmallVectorImpl<ISD::OutputArg> &Outs,
                        const SmallVectorImpl<SDValue> &OutVals,
                        const SDLoc &dl, SelectionDAG &DAG) const override;

    bool supportSplitCSR(MachineFunction *MF) const override {
      return MF->getFunction().getCallingConv() == CallingConv::CXX_FAST_TLS &&
             MF->getFunction().hasFnAttribute(Attribute::NoUnwind);
    }
    void initializeSplitCSR(MachineBasicBlock *Entry) const override;
    void insertCopiesSplitCSR(
        MachineBasicBlock *Entry,
        const SmallVectorImpl<MachineBasicBlock *> &Exits) const override;

    bool isUsedByReturnOnly(SDNode *N, SDValue &Chain) const override;

    bool mayBeEmittedAsTailCall(const CallInst *CI) const override;

    EVT getTypeForExtReturn(LLVMContext &Context, EVT VT,
                            ISD::NodeType ExtendKind) const override;

    bool CanLowerReturn(CallingConv::ID CallConv, MachineFunction &MF,
                        bool isVarArg,
                        const SmallVectorImpl<ISD::OutputArg> &Outs,
                        LLVMContext &Context) const override;

    const MCPhysReg *getScratchRegisters(CallingConv::ID CC) const override;

    TargetLoweringBase::AtomicExpansionKind
    shouldExpandAtomicLoadInIR(LoadInst *SI) const override;
    bool shouldExpandAtomicStoreInIR(StoreInst *SI) const override;
    TargetLoweringBase::AtomicExpansionKind
    shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override;

    LoadInst *
    lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const override;

    bool needsCmpXchgNb(Type *MemType) const;

    void SetupEntryBlockForSjLj(MachineInstr &MI, MachineBasicBlock *MBB,
                                MachineBasicBlock *DispatchBB, int FI) const;

    // Utility function to emit the low-level va_arg code for X86-64.
    MachineBasicBlock *
    EmitVAARG64WithCustomInserter(MachineInstr &MI,
                                  MachineBasicBlock *MBB) const;

    /// Utility function to emit the xmm reg save portion of va_start.
    MachineBasicBlock *
    EmitVAStartSaveXMMRegsWithCustomInserter(MachineInstr &BInstr,
                                             MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredCascadedSelect(MachineInstr &MI1,
                                                 MachineInstr &MI2,
                                                 MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredSelect(MachineInstr &I,
                                         MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredAtomicFP(MachineInstr &I,
                                           MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredCatchRet(MachineInstr &MI,
                                           MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredCatchPad(MachineInstr &MI,
                                           MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredSegAlloca(MachineInstr &MI,
                                            MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredTLSAddr(MachineInstr &MI,
                                          MachineBasicBlock *BB) const;

    MachineBasicBlock *EmitLoweredTLSCall(MachineInstr &MI,
                                          MachineBasicBlock *BB) const;

    MachineBasicBlock *emitEHSjLjSetJmp(MachineInstr &MI,
                                        MachineBasicBlock *MBB) const;

    MachineBasicBlock *emitEHSjLjLongJmp(MachineInstr &MI,
                                         MachineBasicBlock *MBB) const;

    MachineBasicBlock *emitFMA3Instr(MachineInstr &MI,
                                     MachineBasicBlock *MBB) const;

    MachineBasicBlock *EmitSjLjDispatchBlock(MachineInstr &MI,
                                             MachineBasicBlock *MBB) const;

    /// Emit nodes that will be selected as "test Op0,Op0", or something
    /// equivalent, for use with the given x86 condition code.
    SDValue EmitTest(SDValue Op0, unsigned X86CC, const SDLoc &dl,
                     SelectionDAG &DAG) const;

    /// Emit nodes that will be selected as "cmp Op0,Op1", or something
    /// equivalent, for use with the given x86 condition code.
    SDValue EmitCmp(SDValue Op0, SDValue Op1, unsigned X86CC, const SDLoc &dl,
                    SelectionDAG &DAG) const;

    /// Convert a comparison if required by the subtarget.
    SDValue ConvertCmpIfNecessary(SDValue Cmp, SelectionDAG &DAG) const;

    /// Check if replacement of SQRT with RSQRT should be disabled.
    bool isFsqrtCheap(SDValue Operand, SelectionDAG &DAG) const override;

    /// Use rsqrt* to speed up sqrt calculations.
    SDValue getSqrtEstimate(SDValue Operand, SelectionDAG &DAG, int Enabled,
                            int &RefinementSteps, bool &UseOneConstNR,
                            bool Reciprocal) const override;

    /// Use rcp* to speed up fdiv calculations.
    SDValue getRecipEstimate(SDValue Operand, SelectionDAG &DAG, int Enabled,
                             int &RefinementSteps) const override;

    /// Reassociate floating point divisions into multiply by reciprocal.
    unsigned combineRepeatedFPDivisors() const override;
  };

  namespace X86 {
    FastISel *createFastISel(FunctionLoweringInfo &funcInfo,
                             const TargetLibraryInfo *libInfo);
  } // end namespace X86

  // Base class for all X86 non-masked store operations.
  class X86StoreSDNode : public MemSDNode {
  public:
    X86StoreSDNode(unsigned Opcode, unsigned Order, const DebugLoc &dl,
                   SDVTList VTs, EVT MemVT,
                   MachineMemOperand *MMO)
        : MemSDNode(Opcode, Order, dl, VTs, MemVT, MMO) {}
    const SDValue &getValue() const { return getOperand(1); }
    const SDValue &getBasePtr() const { return getOperand(2); }

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VTRUNCSTORES ||
             N->getOpcode() == X86ISD::VTRUNCSTOREUS;
    }
  };

  // Base class for all X86 masked store operations.
  // The class has the same order of operands as MaskedStoreSDNode for
  // convenience.
  class X86MaskedStoreSDNode : public MemSDNode {
  public:
    X86MaskedStoreSDNode(unsigned Opcode, unsigned Order,
                         const DebugLoc &dl, SDVTList VTs, EVT MemVT,
                         MachineMemOperand *MMO)
        : MemSDNode(Opcode, Order, dl, VTs, MemVT, MMO) {}

    const SDValue &getBasePtr() const { return getOperand(1); }
    const SDValue &getMask() const { return getOperand(2); }
    const SDValue &getValue() const { return getOperand(3); }

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VMTRUNCSTORES ||
             N->getOpcode() == X86ISD::VMTRUNCSTOREUS;
    }
  };

  // X86 Truncating Store with Signed saturation.
  class TruncSStoreSDNode : public X86StoreSDNode {
  public:
    TruncSStoreSDNode(unsigned Order, const DebugLoc &dl,
                      SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
        : X86StoreSDNode(X86ISD::VTRUNCSTORES, Order, dl, VTs, MemVT, MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VTRUNCSTORES;
    }
  };

  // X86 Truncating Store with Unsigned saturation.
  class TruncUSStoreSDNode : public X86StoreSDNode {
  public:
    TruncUSStoreSDNode(unsigned Order, const DebugLoc &dl,
                       SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
        : X86StoreSDNode(X86ISD::VTRUNCSTOREUS, Order, dl, VTs, MemVT, MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VTRUNCSTOREUS;
    }
  };

  // X86 Truncating Masked Store with Signed saturation.
  class MaskedTruncSStoreSDNode : public X86MaskedStoreSDNode {
  public:
    MaskedTruncSStoreSDNode(unsigned Order,
                            const DebugLoc &dl, SDVTList VTs, EVT MemVT,
                            MachineMemOperand *MMO)
        : X86MaskedStoreSDNode(X86ISD::VMTRUNCSTORES, Order, dl, VTs, MemVT,
                               MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VMTRUNCSTORES;
    }
  };

  // X86 Truncating Masked Store with Unsigned saturation.
  class MaskedTruncUSStoreSDNode : public X86MaskedStoreSDNode {
  public:
    MaskedTruncUSStoreSDNode(unsigned Order,
                             const DebugLoc &dl, SDVTList VTs, EVT MemVT,
                             MachineMemOperand *MMO)
        : X86MaskedStoreSDNode(X86ISD::VMTRUNCSTOREUS, Order, dl, VTs, MemVT,
                               MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::VMTRUNCSTOREUS;
    }
  };

  // X86 specific Gather/Scatter nodes.
  // The class has the same order of operands as MaskedGatherScatterSDNode for
  // convenience.
  class X86MaskedGatherScatterSDNode : public MemSDNode {
  public:
    X86MaskedGatherScatterSDNode(unsigned Opc, unsigned Order,
                                 const DebugLoc &dl, SDVTList VTs, EVT MemVT,
                                 MachineMemOperand *MMO)
        : MemSDNode(Opc, Order, dl, VTs, MemVT, MMO) {}

    const SDValue &getBasePtr() const { return getOperand(3); }
    const SDValue &getIndex() const { return getOperand(4); }
    const SDValue &getMask() const { return getOperand(2); }
    const SDValue &getValue() const { return getOperand(1); }
    const SDValue &getScale() const { return getOperand(5); }

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::MGATHER ||
             N->getOpcode() == X86ISD::MSCATTER;
    }
  };

  class X86MaskedGatherSDNode : public X86MaskedGatherScatterSDNode {
  public:
    X86MaskedGatherSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
                          EVT MemVT, MachineMemOperand *MMO)
        : X86MaskedGatherScatterSDNode(X86ISD::MGATHER, Order, dl, VTs, MemVT,
                                       MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::MGATHER;
    }
  };

  class X86MaskedScatterSDNode : public X86MaskedGatherScatterSDNode {
  public:
    X86MaskedScatterSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs,
                           EVT MemVT, MachineMemOperand *MMO)
        : X86MaskedGatherScatterSDNode(X86ISD::MSCATTER, Order, dl, VTs, MemVT,
                                       MMO) {}

    static bool classof(const SDNode *N) {
      return N->getOpcode() == X86ISD::MSCATTER;
    }
  };

  /// Generate unpacklo/unpackhi shuffle mask.
  template <typename T = int>
  void createUnpackShuffleMask(MVT VT, SmallVectorImpl<T> &Mask, bool Lo,
                               bool Unary) {
    assert(Mask.empty() && "Expected an empty shuffle mask vector");
    int NumElts = VT.getVectorNumElements();
    int NumEltsInLane = 128 / VT.getScalarSizeInBits();
    for (int i = 0; i < NumElts; ++i) {
      unsigned LaneStart = (i / NumEltsInLane) * NumEltsInLane;
      int Pos = (i % NumEltsInLane) / 2 + LaneStart;
      Pos += (Unary ? 0 : NumElts * (i % 2));
      Pos += (Lo ? 0 : NumEltsInLane / 2);
      Mask.push_back(Pos);
    }
  }
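  // For example, for v4i32 this produces <0,4,1,5> (Lo set, unpacklo) or
  // <2,6,3,7> (Lo clear, unpackhi); with Unary set, indices into the second
  // operand (4..7) become 0..3, giving <0,0,1,1> and <2,2,3,3> respectively.
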
  /// Helper function to scale a shuffle or target shuffle mask, replacing each
  /// mask index with the scaled sequential indices for an equivalent narrowed
  /// mask. This is the reverse process to canWidenShuffleElements, but can
  /// always succeed.
  template <typename T>
  void scaleShuffleMask(int Scale, ArrayRef<T> Mask,
                        SmallVectorImpl<T> &ScaledMask) {
    assert(0 < Scale && "Unexpected scaling factor");
    int NumElts = Mask.size();
    ScaledMask.assign(static_cast<size_t>(NumElts * Scale), -1);

    for (int i = 0; i != NumElts; ++i) {
      int M = Mask[i];

      // Repeat sentinel values in every mask element.
      if (M < 0) {
        for (int s = 0; s != Scale; ++s)
          ScaledMask[(Scale * i) + s] = M;
        continue;
      }

      // Scale mask element and increment across each mask element.
      for (int s = 0; s != Scale; ++s)
        ScaledMask[(Scale * i) + s] = (Scale * M) + s;
    }
  }
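  // For example, scaling the mask <1,-1,0> by 2 yields <2,3,-1,-1,0,1>: each
  // wide element i expands to the narrow elements 2*i and 2*i+1, and
  // sentinels such as -1 (undef) are simply repeated.
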
} // end namespace llvm

#endif // LLVM_LIB_TARGET_X86_X86ISELLOWERING_H
Floating point max and min.
TruncSStoreSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
bool isScalarFPTypeInSSEReg(EVT VT) const
Return true if the specified scalar FP type is computed in an SSE register, not on the X87 floating p...
Copies a GPR into the low 32-bit word of a MMX vector and zero out the high word. ...
CallingConv::ID getCallingConv() const
getCallingConv()/setCallingConv(CC) - These method get and set the calling convention of this functio...
Definition: Function.h:194
This is used to represent a portion of an LLVM function in a low-level Data Dependence DAG representa...
Definition: SelectionDAG.h:210
unsigned getMaxSupportedInterleaveFactor() const override
Get the maximum supported factor for interleaved memory accesses.
Provides information about what library functions are available for the current target.
X86 Read Time-Stamp Counter and Processor ID.
CCValAssign - Represent assignment of one arg/retval to a location.
AddressSpace
Definition: NVPTXBaseInfo.h:22
unsigned getMemcmpEqZeroLoadsPerBlock() const override
Allow multiple load pairs per block for smaller and faster code.
X86MaskedScatterSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
This is an abstract virtual class for memory operations.
Wrapper class for IR location info (IR ordering and DebugLoc) to be passed into SDNode creation funct...
Bit scan reverse.
Floating point reciprocal-sqrt and reciprocal approximation.
static const int FIRST_TARGET_MEMORY_OPCODE
FIRST_TARGET_MEMORY_OPCODE - Target-specific pre-isel operations which do not reference a specific me...
Definition: ISDOpcodes.h:842
const SDValue & getValue() const
Represents one node in the SelectionDAG.
X86 bit-test instructions.
static bool Enabled
Definition: Statistic.cpp:51
const Function & getFunction() const
Return the LLVM function that this machine code represents.
static bool classof(const SDNode *N)
MaskedTruncSStoreSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
Class for arbitrary precision integers.
Definition: APInt.h:69
static SDValue LowerVAARG(SDValue Op, SelectionDAG &DAG)
static bool classof(const SDNode *N)
This instruction implements FP_TO_SINT with the integer destination in memory and a FP reg source...
Bit scan forward.
const char * getClearCacheBuiltinName() const override
Intel processors have a unified instruction and data cache.
amdgpu Simplify well known AMD library false Value Value * Arg
bool ShouldShrinkFPConstant(EVT VT) const override
If true, then instruction selection should seek to shrink the FP constant of the specified type to a ...
Representation of each machine instruction.
Definition: MachineInstr.h:60
static unsigned getScalingFactorCost(const TargetTransformInfo &TTI, const LSRUse &LU, const Formula &F, const Loop &L)
bool isVector() const
Return true if this is a vector value type.
Definition: ValueTypes.h:151
Insert the lower 8-bits of a 32-bit value to a vector, corresponds to X86::PINSRB.
LLVM_NODISCARD bool empty() const
Definition: SmallVector.h:61
A wrapper node for TargetConstantPool, TargetJumpTable, TargetExternalSymbol, TargetGlobalAddress, TargetGlobalTLSAddress, MCSymbol and TargetBlockAddress.
Bitwise logical AND of floating point values.
#define I(x, y, z)
Definition: MD5.cpp:58
#define N
FunctionLoweringInfo - This contains information that is global to a function that is used when lower...
static bool classof(const SDNode *N)
X86MaskedStoreSDNode(unsigned Opcode, unsigned Order, const DebugLoc &dl, SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
LOCK-prefixed arithmetic read-modify-write instructions.
Extract a 16-bit value from a vector and zero extend it to i32, corresponds to X86::PEXTRW.
Blend where the selector is an immediate.
X86MaskedGatherSDNode(unsigned Order, const DebugLoc &dl, SDVTList VTs, EVT MemVT, MachineMemOperand *MMO)
This instruction implements a truncating store to FP stack slots.
Combined add and sub on an FP vector.
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
LegalizeTypeAction
This enum indicates whether a types are legal for a target, and if not, what action should be used to...
This instruction grabs the address of the next argument from a va_list.
LLVM Value Representation.
Definition: Value.h:73
Bitwise logical OR of floating point values.
Dynamic (non-constant condition) vector blend where only the sign bits of the condition elements are ...
X86 Read Performance Monitoring Counters.
constexpr char Size[]
Key for Kernel::Arg::Metadata::mSize.
const SDValue & getBasePtr() const
std::underlying_type< E >::type Mask()
Get a bitmask with 1s in all places up to the high-order bit of E&#39;s largest value.
Definition: BitmaskEnum.h:81
static SDValue LowerSINT_TO_FP(SDValue Op, SelectionDAG &DAG, const SparcTargetLowering &TLI, bool hasHardQuad)
IRTranslator LLVM IR MI
bool isZeroNode(SDValue Elt)
Returns true if Elt is a constant zero or floating point constant +0.0.
StringRef - Represent a constant reference to a string, i.e.
Definition: StringRef.h:49
bool isOffsetSuitableForCodeModel(int64_t Offset, CodeModel::Model M, bool hasSymbolicDisplacement=true)
Returns true of the given offset can be fit into displacement field of the instruction.
bool hasBitPreservingFPLogic(EVT VT) const override
Return true if it is safe to transform an integer-domain bitwise operation into the equivalent floati...
Compute Sum of Absolute Differences.
Scalar intrinsic floating point max and min.
MVT getScalarShiftAmountTy(const DataLayout &, EVT VT) const override
EVT is not used in-tree, but is used by out-of-tree target.
Unlike LLVM values, Selection DAG nodes may return multiple values as the result of a computation...
void scaleShuffleMask(int Scale, ArrayRef< T > Mask, SmallVectorImpl< T > &ScaledMask)
Helper function to scale a shuffle or target shuffle mask, replacing each mask index with the scaled ...
constexpr char Args[]
Key for Kernel::Metadata::mArgs.
FastISel * createFastISel(FunctionLoweringInfo &funcInfo, const TargetLibraryInfo *libInfo)
Shuffle 16 8-bit values within a vector.
This file describes how to lower LLVM code to machine code.
Special wrapper used under X86-64 PIC mode for RIP relative displacements.