//===- AArch64FrameLowering.cpp - AArch64 Frame Lowering -------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file contains the AArch64 implementation of TargetFrameLowering class.
//
// On AArch64, stack frames are structured as follows:
//
// The stack grows downward.
//
// All of the individual frame areas on the frame below are optional, i.e. it's
// possible to create a function so that the particular area isn't present
// in the frame.
//
// At function entry, the "frame" looks as follows:
//
// |                                   | Higher address
// |-----------------------------------|
// |                                   |
// | arguments passed on the stack     |
// |                                   |
// |-----------------------------------| <- sp
// |                                   | Lower address
//
//
// After the prologue has run, the frame has the following general structure.
// Note that this doesn't depict the case where a red-zone is used. Also,
// technically the last frame area (VLAs) doesn't get created until the main
// function body has started running, after the prologue; it is depicted here
// for completeness.
//
// |                                   | Higher address
// |-----------------------------------|
// |                                   |
// | arguments passed on the stack     |
// |                                   |
// |-----------------------------------|
// |                                   |
// | (Win64 only) varargs from reg     |
// |                                   |
// |-----------------------------------|
// |                                   |
// | callee-saved gpr registers        | <--.
// |                                   |    | On Darwin platforms these
// |- - - - - - - - - - - - - - - - - -|    | callee saves are swapped,
// | prev_lr                           |    | (frame record first)
// | prev_fp                           | <--'
// | async context if needed           |
// | (a.k.a. "frame record")           |
// |-----------------------------------| <- fp(=x29)
// |                                   |
// | callee-saved fp/simd/SVE regs     |
// |                                   |
// |-----------------------------------|
// |                                   |
// |        SVE stack objects          |
// |                                   |
// |-----------------------------------|
// |.empty.space.to.make.part.below....|
// |.aligned.in.case.it.needs.more.than| (size of this area is unknown at
// |.the.standard.16-byte.alignment....|  compile time; if present)
// |-----------------------------------|
// |                                   |
// | local variables of fixed size     |
// | including spill slots             |
// |-----------------------------------| <- bp(not defined by ABI,
// |.variable-sized.local.variables....|       LLVM chooses X19)
// |.(VLAs)............................| (size of this area is unknown at
// |...................................|  compile time)
// |-----------------------------------| <- sp
// |                                   | Lower address
//
//
// To access the data in a frame, a constant offset from one of the pointers
// (fp, bp, sp) to the data must be computable at compile time. The size of
// the areas with a dotted background cannot be computed at compile time if
// they are present, so all three of fp, bp and sp must be set up to be able
// to access all contents in the frame areas, assuming all of the frame areas
// are non-empty.
//
// For most functions, some of the frame areas are empty. For those functions,
// it may not be necessary to set up fp or bp:
// * A base pointer is definitely needed when there are both VLAs and local
//   variables with more-than-default alignment requirements.
// * A frame pointer is definitely needed when there are local variables with
//   more-than-default alignment requirements.
//
// For Darwin platforms the frame-record (fp, lr) is stored at the top of the
// callee-saved area, since the unwind encoding does not allow for encoding
// this dynamically and existing tools depend on this layout. For other
// platforms, the frame-record is stored at the bottom of the (gpr) callee-saved
// area to allow SVE stack objects (allocated directly below the callee-saves,
// if available) to be accessed directly from the framepointer.
// The SVE spill/fill instructions have VL-scaled addressing modes such
// as:
//    ldr z8, [fp, #-7 mul vl]
// For SVE the size of the vector length (VL) is not known at compile-time, so
// '#-7 mul vl' is an offset that can only be evaluated at runtime. With this
// layout, we don't need to add an unscaled offset to the framepointer before
// accessing the SVE object in the frame.
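// For example, at a runtime vector length of 256 bits (VL = 32 bytes), the
// offset '#-7 mul vl' evaluates to -7 * 32 = -224 bytes below fp.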
//
// In some cases when a base pointer is not strictly needed, it is generated
// anyway when offsets from the frame pointer to access local variables become
// so large that the offset can't be encoded in the immediate fields of loads
// or stores.
//
// Outgoing function arguments must be at the bottom of the stack frame when
// calling another function. If we do not have variable-sized stack objects, we
// can allocate a "reserved call frame" area at the bottom of the local
// variable area, large enough for all outgoing calls. If we do have VLAs, then
// the stack pointer must be decremented and incremented around each call to
// make space for the arguments below the VLAs.
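//
// As an illustrative sketch (not emitted verbatim by this file): with VLAs
// present, a call needing 16 bytes of stack arguments is bracketed by its own
// sp adjustment,
//
//     sub  sp, sp, #16      // make space for the outgoing arguments
//     str  x8, [sp]         // pass an argument on the stack
//     bl   callee
//     add  sp, sp, #16      // release the argument space again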
//
// FIXME: also explain the redzone concept.
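// (In brief: the red zone is a small area immediately below sp, 128 bytes
// where supported, that a leaf function may use for its locals without
// adjusting sp at all; see EnableRedZone and canUseRedZone() below.)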
//
// An example of the prologue:
//
//     .globl __foo
//     .align 2
//  __foo:
//  Ltmp0:
//     .cfi_startproc
//     .cfi_personality 155, ___gxx_personality_v0
//  Leh_func_begin:
//     .cfi_lsda 16, Lexception33
//
//     stp  xa,bx, [sp, -#offset]!
//     ...
//     stp  x28, x27, [sp, #offset-32]
//     stp  fp, lr, [sp, #offset-16]
//     add  fp, sp, #offset - 16
//     sub  sp, sp, #1360
//
// The Stack:
//       +-------------------------------------------+
// 10000 | ........ | ........ | ........ | ........ |
// 10004 | ........ | ........ | ........ | ........ |
//       +-------------------------------------------+
// 10008 | ........ | ........ | ........ | ........ |
// 1000c | ........ | ........ | ........ | ........ |
//       +===========================================+
// 10010 |                X28 Register               |
// 10014 |                X28 Register               |
//       +-------------------------------------------+
// 10018 |                X27 Register               |
// 1001c |                X27 Register               |
//       +===========================================+
// 10020 |                Frame Pointer              |
// 10024 |                Frame Pointer              |
//       +-------------------------------------------+
// 10028 |                Link Register              |
// 1002c |                Link Register              |
//       +===========================================+
// 10030 | ........ | ........ | ........ | ........ |
// 10034 | ........ | ........ | ........ | ........ |
//       +-------------------------------------------+
// 10038 | ........ | ........ | ........ | ........ |
// 1003c | ........ | ........ | ........ | ........ |
//       +-------------------------------------------+
//
//     [sp] = 10030        ::    >>initial value<<
//     sp = 10020          ::  stp  fp, lr, [sp, #-16]!
//     fp = sp == 10020    ::  mov fp, sp
//     [sp] == 10020       ::  stp  x28, x27, [sp, #-16]!
//     sp == 10010         ::  >>final value<<
//
// The frame pointer (w29) points to address 10020. If we use an offset of
// '16' from 'w29', we get the CFI offsets of -8 for w30, -16 for w29, -24
// for w27, and -32 for w28:
//
//  Ltmp1:
//     .cfi_def_cfa w29, 16
//  Ltmp2:
//     .cfi_offset w30, -8
//  Ltmp3:
//     .cfi_offset w29, -16
//  Ltmp4:
//     .cfi_offset w27, -24
//  Ltmp5:
//     .cfi_offset w28, -32
//
//===----------------------------------------------------------------------===//

#include "AArch64FrameLowering.h"
#include "AArch64InstrInfo.h"
#include "AArch64MachineFunctionInfo.h"
#include "AArch64RegisterInfo.h"
#include "AArch64Subtarget.h"
#include "AArch64TargetMachine.h"
#include "MCTargetDesc/AArch64AddressingModes.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/CodeGen/LivePhysRegs.h"
#include "llvm/CodeGen/MachineFrameInfo.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/IR/Attributes.h"
#include "llvm/IR/CallingConv.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/Function.h"
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCDwarf.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MathExtras.h"
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

using namespace llvm;

#define DEBUG_TYPE "frame-info"

static cl::opt<bool> EnableRedZone("aarch64-redzone",
                                   cl::desc("enable use of redzone on AArch64"),
                                   cl::init(false), cl::Hidden);

static cl::opt<bool> StackTaggingMergeSetTag(
    "stack-tagging-merge-settag",
    cl::desc("merge settag instruction in function epilog"), cl::init(true),
    cl::Hidden);

static cl::opt<bool> OrderFrameObjects("aarch64-order-frame-objects",
                                       cl::desc("sort stack allocations"),
                                       cl::init(true), cl::Hidden);

static cl::opt<bool> EnableHomogeneousPrologEpilog(
    "homogeneous-prolog-epilog", cl::Hidden,
    cl::desc("Emit homogeneous prologue and epilogue for the size "
             "optimization (default = off)"));

STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");

/// Returns how much of the incoming argument stack area (in bytes) we should
/// clean up in an epilogue. For the C calling convention this will be 0, for
/// guaranteed tail call conventions it can be positive (a normal return or a
/// tail call to a function that uses less stack space for arguments) or
/// negative (for a tail call to a function that needs more stack space than us
/// for arguments).
static int64_t getArgumentStackToRestore(MachineFunction &MF,
                                         MachineBasicBlock &MBB) {
  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
  MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
  bool IsTailCallReturn = (MBB.end() != MBBI)
                              ? AArch64InstrInfo::isTailCallReturnInst(*MBBI)
                              : false;

  int64_t ArgumentPopSize = 0;
  if (IsTailCallReturn) {
    MachineOperand &StackAdjust = MBBI->getOperand(1);

    // For a tail-call in a callee-pops-arguments environment, some or all of
    // the stack may actually be in use for the call's arguments, this is
    // calculated during LowerCall and consumed here...
    ArgumentPopSize = StackAdjust.getImm();
  } else {
    // ... otherwise the amount to pop is *all* of the argument space,
    // conveniently stored in the MachineFunctionInfo by
    // LowerFormalArguments. This will, of course, be zero for the C calling
    // convention.
    ArgumentPopSize = AFI->getArgumentStackToRestore();
  }

  return ArgumentPopSize;
}

static bool produceCompactUnwindFrame(MachineFunction &MF);
static bool needsWinCFI(const MachineFunction &MF);
static StackOffset getSVEStackSize(const MachineFunction &MF);

/// Returns true if a homogeneous prolog or epilog code can be emitted
/// for the size optimization. If possible, a frame helper call is injected.
/// When Exit block is given, this check is for epilog.
bool AArch64FrameLowering::homogeneousPrologEpilog(
    MachineFunction &MF, MachineBasicBlock *Exit) const {
  if (!MF.getFunction().hasMinSize())
    return false;
  if (!EnableHomogeneousPrologEpilog)
    return false;
  if (EnableRedZone)
    return false;

  // TODO: Windows is not supported yet.
  if (needsWinCFI(MF))
    return false;
  // TODO: SVE is not supported yet.
  if (getSVEStackSize(MF))
    return false;

  // Bail on stack adjustment needed on return for simplicity.
  const MachineFrameInfo &MFI = MF.getFrameInfo();
  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();
  if (MFI.hasVarSizedObjects() || RegInfo->hasStackRealignment(MF))
    return false;
  if (Exit && getArgumentStackToRestore(MF, *Exit))
    return false;

  auto *AFI = MF.getInfo<AArch64FunctionInfo>();
  if (AFI->hasSwiftAsyncContext())
    return false;

  // If there are an odd number of GPRs before LR and FP in the CSRs list,
  // they will not be paired into one RegPairInfo, which is incompatible with
  // the assumption made by the homogeneous prolog epilog pass.
  const MCPhysReg *CSRegs = MF.getRegInfo().getCalleeSavedRegs();
  unsigned NumGPRs = 0;
  for (unsigned I = 0; CSRegs[I]; ++I) {
    Register Reg = CSRegs[I];
    if (Reg == AArch64::LR) {
      assert(CSRegs[I + 1] == AArch64::FP);
      if (NumGPRs % 2 != 0)
        return false;
      break;
    }
    if (AArch64::GPR64RegClass.contains(Reg))
      ++NumGPRs;
  }

  return true;
}

/// Returns true if CSRs should be paired.
bool AArch64FrameLowering::producePairRegisters(MachineFunction &MF) const {
  return produceCompactUnwindFrame(MF) || homogeneousPrologEpilog(MF);
}

/// This is the biggest offset to the stack pointer we can encode in aarch64
/// instructions (without using a separate calculation and a temp register).
/// Note that the exceptions here are vector stores/loads, which cannot encode
/// any displacements (see estimateRSStackSizeLimit(),
/// isAArch64FrameOffsetLegal()).
static const unsigned DefaultSafeSPDisplacement = 255;
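// (For reference: unscaled LDUR/STUR-class addressing takes a signed 9-bit
// byte offset, i.e. the range [-256, 255], which is where 255 comes from.)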

/// Look at each instruction that references stack frames and return the stack
/// size limit beyond which some of these instructions will require a scratch
/// register during their expansion later.
static unsigned estimateRSStackSizeLimit(MachineFunction &MF) {
  // FIXME: For now, just conservatively guesstimate based on unscaled indexing
  // range. We'll end up allocating an unnecessary spill slot a lot, but
  // realistically that's not a big deal at this stage of the game.
  for (MachineBasicBlock &MBB : MF) {
    for (MachineInstr &MI : MBB) {
      if (MI.isDebugInstr() || MI.isPseudo() ||
          MI.getOpcode() == AArch64::ADDXri ||
          MI.getOpcode() == AArch64::ADDSXri)
        continue;

      for (const MachineOperand &MO : MI.operands()) {
        if (!MO.isFI())
          continue;

        StackOffset Offset;
        if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==
            AArch64FrameOffsetCannotUpdate)
          return 0;
      }
    }
  }
  return DefaultSafeSPDisplacement;
}

TargetStackID::Value
AArch64FrameLowering::getStackIDForScalableVectors() const {
  return TargetStackID::ScalableVector;
}

/// Returns the size of the fixed object area (allocated next to sp on entry)
/// On Win64 this may include a var args area and an UnwindHelp object for EH.
static unsigned getFixedObjectSize(const MachineFunction &MF,
                                   const AArch64FunctionInfo *AFI, bool IsWin64,
                                   bool IsFunclet) {
  if (!IsWin64 || IsFunclet) {
    return AFI->getTailCallReservedStack();
  } else {
    if (AFI->getTailCallReservedStack() != 0 &&
        !MF.getFunction().getAttributes().hasAttrSomewhere(
            Attribute::SwiftAsync))
      report_fatal_error("cannot generate ABI-changing tail call for Win64");
    // Var args are stored here in the primary function.
    const unsigned VarArgsArea = AFI->getVarArgsGPRSize();
    // To support EH funclets we allocate an UnwindHelp object
    const unsigned UnwindHelpObject = (MF.hasEHFunclets() ? 8 : 0);
    return AFI->getTailCallReservedStack() +
           alignTo(VarArgsArea + UnwindHelpObject, 16);
  }
}

/// Returns the size of the entire SVE stackframe (calleesaves + spills).
static StackOffset getSVEStackSize(const MachineFunction &MF) {
  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
  return StackOffset::getScalable((int64_t)AFI->getStackSizeSVE());
}

bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
  if (!EnableRedZone)
    return false;

  // Don't use the red zone if the function explicitly asks us not to.
  // This is typically used for kernel code.
  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const unsigned RedZoneSize =
      Subtarget.getTargetLowering()->getRedZoneSize(MF.getFunction());
  if (!RedZoneSize)
    return false;

  const MachineFrameInfo &MFI = MF.getFrameInfo();
  const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
  uint64_t NumBytes = AFI->getLocalStackSize();

  return !(MFI.hasCalls() || hasFP(MF) || NumBytes > RedZoneSize ||
           getSVEStackSize(MF));
}

/// hasFP - Return true if the specified function should have a dedicated frame
/// pointer register.
bool AArch64FrameLowering::hasFP(const MachineFunction &MF) const {
  const MachineFrameInfo &MFI = MF.getFrameInfo();
  const TargetRegisterInfo *RegInfo = MF.getSubtarget().getRegisterInfo();

  // Win64 EH requires a frame pointer if funclets are present, as the locals
  // are accessed off the frame pointer in both the parent function and the
  // funclets.
  if (MF.hasEHFunclets())
    return true;
  // Retain behavior of always omitting the FP for leaf functions when
  // possible.
  if (MF.getTarget().Options.DisableFramePointerElim(MF))
    return true;
  if (MFI.hasVarSizedObjects() || MFI.isFrameAddressTaken() ||
      MFI.hasStackMap() || MFI.hasPatchPoint() ||
      RegInfo->hasStackRealignment(MF))
    return true;
  // With large callframes around we may need to use FP to access the
  // scavenging emergency spillslot.
  //
  // Unfortunately some calls to hasFP() like machine verifier ->
  // getReservedReg() -> hasFP in the middle of global isel are too early
  // to know the max call frame size. Hopefully conservatively returning "true"
  // in those cases is fine.
  // DefaultSafeSPDisplacement is fine as we only emergency spill GP regs.
  if (!MFI.isMaxCallFrameSizeComputed() ||
      MFI.getMaxCallFrameSize() > DefaultSafeSPDisplacement)
    return true;

  return false;
}

/// hasReservedCallFrame - Under normal circumstances, when a frame pointer is
/// not required, we reserve argument space for call sites in the function
/// immediately on entry to the current function. This eliminates the need for
/// add/sub sp brackets around call sites. Returns true if the call frame is
/// included as part of the stack frame.
bool
AArch64FrameLowering::hasReservedCallFrame(const MachineFunction &MF) const {
  // The stack probing code for the dynamically allocated outgoing arguments
  // area assumes that the stack is probed at the top - either by the prologue
  // code, which issues a probe if `hasVarSizedObjects` returns true, or by the
  // most recent variable-sized object allocation. Changing the condition here
  // may need to be followed up by changes to the probe issuing logic.
  return !MF.getFrameInfo().hasVarSizedObjects();
}

MachineBasicBlock::iterator AArch64FrameLowering::eliminateCallFramePseudoInstr(
    MachineFunction &MF, MachineBasicBlock &MBB,
    MachineBasicBlock::iterator I) const {
  const AArch64InstrInfo *TII =
      static_cast<const AArch64InstrInfo *>(MF.getSubtarget().getInstrInfo());
  const AArch64TargetLowering *TLI =
      MF.getSubtarget<AArch64Subtarget>().getTargetLowering();
  [[maybe_unused]] MachineFrameInfo &MFI = MF.getFrameInfo();
  DebugLoc DL = I->getDebugLoc();
  unsigned Opc = I->getOpcode();
  bool IsDestroy = Opc == TII->getCallFrameDestroyOpcode();
  uint64_t CalleePopAmount = IsDestroy ? I->getOperand(1).getImm() : 0;

  if (!hasReservedCallFrame(MF)) {
    int64_t Amount = I->getOperand(0).getImm();
    Amount = alignTo(Amount, getStackAlign());
    if (!IsDestroy)
      Amount = -Amount;

    // N.b. if CalleePopAmount is valid but zero (i.e. callee would pop, but it
    // doesn't have to pop anything), then the first operand will be zero too so
    // this adjustment is a no-op.
    if (CalleePopAmount == 0) {
      // FIXME: in-function stack adjustment for calls is limited to 24-bits
      // because there's no guaranteed temporary register available.
      //
      // ADD/SUB (immediate) has only LSL #0 and LSL #12 available.
      // 1) For offset <= 12-bit, we use LSL #0
      // 2) For 12-bit <= offset <= 24-bit, we use two instructions. One uses
      //    LSL #0, and the other uses LSL #12.
      //
      // Most call frames will be allocated at the start of a function so
      // this is OK, but it is a limitation that needs dealing with.
      assert(Amount > -0xffffff && Amount < 0xffffff && "call frame too large");

      if (TLI->hasInlineStackProbe(MF) &&
          -Amount >= AArch64::StackProbeMaxUnprobedStack) {
        // When stack probing is enabled, the decrement of SP may need to be
        // probed. We only need to do this if the call site needs 1024 bytes of
        // space or more, because a region smaller than that is allowed to be
        // unprobed at an ABI boundary. We rely on the fact that SP has been
        // probed exactly at this point, either by the prologue or most recent
        // dynamic allocation.
        assert(MFI.hasVarSizedObjects() &&
               "non-reserved call frame without var sized objects?");
        Register ScratchReg =
            MF.getRegInfo().createVirtualRegister(&AArch64::GPR64RegClass);
        inlineStackProbeFixed(I, ScratchReg, -Amount, StackOffset::get(0, 0));
      } else {
        emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
                        StackOffset::getFixed(Amount), TII);
      }
    }
  } else if (CalleePopAmount != 0) {
    // If the calling convention demands that the callee pops arguments from
    // the stack, we want to add it back if we have a reserved call frame.
    assert(CalleePopAmount < 0xffffff && "call frame too large");
    emitFrameOffset(MBB, I, DL, AArch64::SP, AArch64::SP,
                    StackOffset::getFixed(-(int64_t)CalleePopAmount), TII);
  }
  return MBB.erase(I);
}

void AArch64FrameLowering::emitCalleeSavedGPRLocations(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
  MachineFunction &MF = *MBB.getParent();
  MachineFrameInfo &MFI = MF.getFrameInfo();

  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
  if (CSI.empty())
    return;

  const TargetSubtargetInfo &STI = MF.getSubtarget();
  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
  const TargetInstrInfo &TII = *STI.getInstrInfo();
  DebugLoc DL = MBB.findDebugLoc(MBBI);

  for (const auto &Info : CSI) {
    if (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector)
      continue;

    assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
    unsigned DwarfReg = TRI.getDwarfRegNum(Info.getReg(), true);

    int64_t Offset =
        MFI.getObjectOffset(Info.getFrameIdx()) - getOffsetOfLocalArea();
    unsigned CFIIndex = MF.addFrameInst(
        MCCFIInstruction::createOffset(nullptr, DwarfReg, Offset));
    BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlags(MachineInstr::FrameSetup);
  }
}

void AArch64FrameLowering::emitCalleeSavedSVELocations(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
  MachineFunction &MF = *MBB.getParent();
  MachineFrameInfo &MFI = MF.getFrameInfo();

  // Add callee saved registers to move list.
  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
  if (CSI.empty())
    return;

  const TargetSubtargetInfo &STI = MF.getSubtarget();
  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
  const TargetInstrInfo &TII = *STI.getInstrInfo();
  DebugLoc DL = MBB.findDebugLoc(MBBI);
  AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();

  for (const auto &Info : CSI) {
    if (!(MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
      continue;

    // Not all unwinders may know about SVE registers, so assume the lowest
    // common denominator.
    assert(!Info.isSpilledToReg() && "Spilling to registers not implemented");
    unsigned Reg = Info.getReg();
    if (!static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
      continue;

    StackOffset Offset =
        StackOffset::getScalable(MFI.getObjectOffset(Info.getFrameIdx())) -
        StackOffset::getFixed(AFI.getCalleeSavedStackSize(MFI));

    unsigned CFIIndex = MF.addFrameInst(createCFAOffset(TRI, Reg, Offset));
    BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlag(MachineInstr::FrameSetup);
  }
}

static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF,
                               MachineBasicBlock &MBB,
                               MachineBasicBlock::iterator InsertPt,
                               unsigned DwarfReg) {
  unsigned CFIIndex =
      MF.addFrameInst(MCCFIInstruction::createSameValue(nullptr, DwarfReg));
  BuildMI(MBB, InsertPt, DebugLoc(), Desc).addCFIIndex(CFIIndex);
}

void AArch64FrameLowering::resetCFIToInitialState(
    MachineBasicBlock &MBB) const {

  MachineFunction &MF = *MBB.getParent();
  const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
  const auto &TRI =
      static_cast<const AArch64RegisterInfo &>(*Subtarget.getRegisterInfo());
  const auto &MFI = *MF.getInfo<AArch64FunctionInfo>();

  const MCInstrDesc &CFIDesc = TII.get(TargetOpcode::CFI_INSTRUCTION);
  DebugLoc DL;

  // Reset the CFA to `SP + 0`.
  MachineBasicBlock::iterator InsertPt = MBB.begin();
  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
      nullptr, TRI.getDwarfRegNum(AArch64::SP, true), 0));
  BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);

  // Flip the RA sign state.
  if (MFI.shouldSignReturnAddress(MF)) {
    CFIIndex = MF.addFrameInst(MCCFIInstruction::createNegateRAState(nullptr));
    BuildMI(MBB, InsertPt, DL, CFIDesc).addCFIIndex(CFIIndex);
  }

  // Shadow call stack uses X18, reset it.
  if (MFI.needsShadowCallStackPrologueEpilogue(MF))
    insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
                       TRI.getDwarfRegNum(AArch64::X18, true));

  // Emit .cfi_same_value for callee-saved registers.
  const std::vector<CalleeSavedInfo> &CSI =
      MF.getFrameInfo().getCalleeSavedInfo();
  for (const auto &Info : CSI) {
    unsigned Reg = Info.getReg();
    if (!TRI.regNeedsCFI(Reg, Reg))
      continue;
    insertCFISameValue(CFIDesc, MF, MBB, InsertPt,
                       TRI.getDwarfRegNum(Reg, true));
  }
}

static void emitCalleeSavedRestores(MachineBasicBlock &MBB,
                                    MachineBasicBlock::iterator MBBI,
                                    bool SVE) {
  MachineFunction &MF = *MBB.getParent();
  MachineFrameInfo &MFI = MF.getFrameInfo();

  const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
  if (CSI.empty())
    return;

  const TargetSubtargetInfo &STI = MF.getSubtarget();
  const TargetRegisterInfo &TRI = *STI.getRegisterInfo();
  const TargetInstrInfo &TII = *STI.getInstrInfo();
  DebugLoc DL = MBB.findDebugLoc(MBBI);

  for (const auto &Info : CSI) {
    if (SVE !=
        (MFI.getStackID(Info.getFrameIdx()) == TargetStackID::ScalableVector))
      continue;

    unsigned Reg = Info.getReg();
    if (SVE &&
        !static_cast<const AArch64RegisterInfo &>(TRI).regNeedsCFI(Reg, Reg))
      continue;

    unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createRestore(
        nullptr, TRI.getDwarfRegNum(Info.getReg(), true)));
    BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlags(MachineInstr::FrameDestroy);
  }
}

void AArch64FrameLowering::emitCalleeSavedGPRRestores(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
  emitCalleeSavedRestores(MBB, MBBI, false);
}

void AArch64FrameLowering::emitCalleeSavedSVERestores(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI) const {
  emitCalleeSavedRestores(MBB, MBBI, true);
}

// Return the maximum possible number of bytes for `Size` due to the
// architectural limit on the size of an SVE register.
static int64_t upperBound(StackOffset Size) {
  static const int64_t MAX_BYTES_PER_SCALABLE_BYTE = 16;
  return Size.getScalable() * MAX_BYTES_PER_SCALABLE_BYTE + Size.getFixed();
}
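// For example, a StackOffset of 16 fixed bytes plus 2 scalable bytes has an
// upper bound of 2 * 16 + 16 = 48 bytes: the architectural maximum vector
// length is 2048 bits, so each scalable byte occupies at most 16 real bytes.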

void AArch64FrameLowering::allocateStackSpace(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
    int64_t RealignmentPadding, StackOffset AllocSize, bool NeedsWinCFI,
    bool *HasWinCFI, bool EmitCFI, StackOffset InitialOffset,
    bool FollowupAllocs) const {

  if (!AllocSize)
    return;

  DebugLoc DL;
  MachineFunction &MF = *MBB.getParent();
  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
  AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
  const MachineFrameInfo &MFI = MF.getFrameInfo();

  const int64_t MaxAlign = MFI.getMaxAlign().value();
  const uint64_t AndMask = ~(MaxAlign - 1);

  if (!Subtarget.getTargetLowering()->hasInlineStackProbe(MF)) {
    Register TargetReg = RealignmentPadding
                             ? findScratchNonCalleeSaveRegister(&MBB)
                             : AArch64::SP;
    // SUB Xd/SP, SP, AllocSize
    emitFrameOffset(MBB, MBBI, DL, TargetReg, AArch64::SP, -AllocSize, &TII,
                    MachineInstr::FrameSetup, false, NeedsWinCFI, HasWinCFI,
                    EmitCFI, InitialOffset);

    if (RealignmentPadding) {
      // AND SP, X9, 0b11111...0000
      BuildMI(MBB, MBBI, DL, TII.get(AArch64::ANDXri), AArch64::SP)
          .addReg(TargetReg, RegState::Kill)
          .addImm(AArch64_AM::encodeLogicalImmediate(AndMask, 64))
          .setMIFlag(MachineInstr::FrameSetup);
      AFI.setStackRealigned(true);

      // No need for SEH instructions here; if we're realigning the stack,
      // we've set a frame pointer and already finished the SEH prologue.
      assert(!NeedsWinCFI);
    }
    return;
  }

  //
  // Stack probing allocation.
  //

  // Fixed length allocation. If we don't need to re-align the stack and don't
  // have SVE objects, we can use a more efficient sequence for stack probing.
  if (AllocSize.getScalable() == 0 && RealignmentPadding == 0) {
    Register ScratchReg = findScratchNonCalleeSaveRegister(&MBB);
    assert(ScratchReg != AArch64::NoRegister);
    BuildMI(MBB, MBBI, DL, TII.get(AArch64::PROBED_STACKALLOC))
        .addDef(ScratchReg)
        .addImm(AllocSize.getFixed())
        .addImm(InitialOffset.getFixed())
        .addImm(InitialOffset.getScalable());
    // The fixed allocation may leave unprobed bytes at the top of the
    // stack. If we have subsequent allocation (e.g. if we have variable-sized
    // objects), we need to issue an extra probe, so these allocations start in
    // a known state.
    if (FollowupAllocs) {
      // STR XZR, [SP]
      BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXui))
          .addReg(AArch64::XZR)
          .addReg(AArch64::SP)
          .addImm(0)
          .setMIFlags(MachineInstr::FrameSetup);
    }

    return;
  }

  // Variable length allocation.

  // If the (unknown) allocation size cannot exceed the probe size, decrement
  // the stack pointer right away.
  int64_t ProbeSize = AFI.getStackProbeSize();
  if (upperBound(AllocSize) + RealignmentPadding <= ProbeSize) {
    Register ScratchReg = RealignmentPadding
                              ? findScratchNonCalleeSaveRegister(&MBB)
                              : AArch64::SP;
    assert(ScratchReg != AArch64::NoRegister);
    // SUB Xd, SP, AllocSize
    emitFrameOffset(MBB, MBBI, DL, ScratchReg, AArch64::SP, -AllocSize, &TII,
                    MachineInstr::FrameSetup, false, NeedsWinCFI, HasWinCFI,
                    EmitCFI, InitialOffset);
    if (RealignmentPadding) {
      // AND SP, Xn, 0b11111...0000
      BuildMI(MBB, MBBI, DL, TII.get(AArch64::ANDXri), AArch64::SP)
          .addReg(ScratchReg, RegState::Kill)
          .addImm(AArch64_AM::encodeLogicalImmediate(AndMask, 64))
          .setMIFlag(MachineInstr::FrameSetup);
      AFI.setStackRealigned(true);
    }
    if (FollowupAllocs || upperBound(AllocSize) + RealignmentPadding >
                              AArch64::StackProbeMaxUnprobedStack) {
      // STR XZR, [SP]
      BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXui))
          .addReg(AArch64::XZR)
          .addReg(AArch64::SP)
          .addImm(0)
          .setMIFlags(MachineInstr::FrameSetup);
    }
    return;
  }

  // Emit a variable-length allocation probing loop.
  // TODO: As an optimisation, the loop can be "unrolled" into a few parts,
  // each of them guaranteed to adjust the stack by less than the probe size.
  Register TargetReg = findScratchNonCalleeSaveRegister(&MBB);
  assert(TargetReg != AArch64::NoRegister);
  // SUB Xd, SP, AllocSize
  emitFrameOffset(MBB, MBBI, DL, TargetReg, AArch64::SP, -AllocSize, &TII,
                  MachineInstr::FrameSetup, false, NeedsWinCFI, HasWinCFI,
                  EmitCFI, InitialOffset);
  if (RealignmentPadding) {
    // AND Xn, Xn, 0b11111...0000
    BuildMI(MBB, MBBI, DL, TII.get(AArch64::ANDXri), TargetReg)
        .addReg(TargetReg, RegState::Kill)
        .addImm(AArch64_AM::encodeLogicalImmediate(AndMask, 64))
        .setMIFlag(MachineInstr::FrameSetup);
  }

  BuildMI(MBB, MBBI, DL, TII.get(AArch64::PROBED_STACKALLOC_VAR))
      .addReg(TargetReg);
  if (EmitCFI) {
    // Set the CFA register back to SP.
    unsigned Reg =
        Subtarget.getRegisterInfo()->getDwarfRegNum(AArch64::SP, true);
    unsigned CFIIndex =
        MF.addFrameInst(MCCFIInstruction::createDefCfaRegister(nullptr, Reg));
    BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlags(MachineInstr::FrameSetup);
  }
  if (RealignmentPadding)
    AFI.setStackRealigned(true);
}

static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE) {
  switch (Reg.id()) {
  default:
    // The called routine is expected to preserve r19-r28
    // r29 and r30 are used as frame pointer and link register resp.
    return 0;

    // GPRs
#define CASE(n)                                                                \
  case AArch64::W##n:                                                          \
  case AArch64::X##n:                                                          \
    return AArch64::X##n
  CASE(0);
  CASE(1);
  CASE(2);
  CASE(3);
  CASE(4);
  CASE(5);
  CASE(6);
  CASE(7);
  CASE(8);
  CASE(9);
  CASE(10);
  CASE(11);
  CASE(12);
  CASE(13);
  CASE(14);
  CASE(15);
  CASE(16);
  CASE(17);
  CASE(18);
#undef CASE

    // FPRs
#define CASE(n)                                                                \
  case AArch64::B##n:                                                          \
  case AArch64::H##n:                                                          \
  case AArch64::S##n:                                                          \
  case AArch64::D##n:                                                          \
  case AArch64::Q##n:                                                          \
    return HasSVE ? AArch64::Z##n : AArch64::Q##n
  CASE(0);
  CASE(1);
  CASE(2);
  CASE(3);
  CASE(4);
  CASE(5);
  CASE(6);
  CASE(7);
  CASE(8);
  CASE(9);
  CASE(10);
  CASE(11);
  CASE(12);
  CASE(13);
  CASE(14);
  CASE(15);
  CASE(16);
  CASE(17);
  CASE(18);
  CASE(19);
  CASE(20);
  CASE(21);
  CASE(22);
  CASE(23);
  CASE(24);
  CASE(25);
  CASE(26);
  CASE(27);
  CASE(28);
  CASE(29);
  CASE(30);
  CASE(31);
#undef CASE
  }
}
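// For example, getRegisterOrZero(AArch64::W3, /*HasSVE=*/false) returns
// AArch64::X3 and getRegisterOrZero(AArch64::D3, /*HasSVE=*/true) returns
// AArch64::Z3, while callee-preserved GPRs (r19-r30) fall into the default
// case and map to 0, so the caller skips them.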

void AArch64FrameLowering::emitZeroCallUsedRegs(BitVector RegsToZero,
                                                MachineBasicBlock &MBB) const {
  // Insertion point.
  MachineBasicBlock::iterator MBBI = MBB.begin();

  // Fake a debug loc.
  DebugLoc DL;
  if (MBBI != MBB.end())
    DL = MBBI->getDebugLoc();

  const MachineFunction &MF = *MBB.getParent();
  const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo &TRI = *STI.getRegisterInfo();

  BitVector GPRsToZero(TRI.getNumRegs());
  BitVector FPRsToZero(TRI.getNumRegs());
  bool HasSVE = STI.hasSVE();
  for (MCRegister Reg : RegsToZero.set_bits()) {
    if (TRI.isGeneralPurposeRegister(MF, Reg)) {
      // For GPRs, we only care to clear out the 64-bit register.
      if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
        GPRsToZero.set(XReg);
    } else if (AArch64::FPR128RegClass.contains(Reg) ||
               AArch64::FPR64RegClass.contains(Reg) ||
               AArch64::FPR32RegClass.contains(Reg) ||
               AArch64::FPR16RegClass.contains(Reg) ||
               AArch64::FPR8RegClass.contains(Reg)) {
      // For FPRs, clear out the widest register containing this one (Q, or Z
      // when SVE is available).
      if (MCRegister XReg = getRegisterOrZero(Reg, HasSVE))
        FPRsToZero.set(XReg);
    }
  }

  const AArch64InstrInfo &TII = *STI.getInstrInfo();

  // Zero out GPRs.
  for (MCRegister Reg : GPRsToZero.set_bits())
    TII.buildClearRegister(Reg, MBB, MBBI, DL);

  // Zero out FP/vector registers.
  for (MCRegister Reg : FPRsToZero.set_bits())
    TII.buildClearRegister(Reg, MBB, MBBI, DL);

  if (HasSVE) {
    for (MCRegister PReg :
         {AArch64::P0, AArch64::P1, AArch64::P2, AArch64::P3, AArch64::P4,
          AArch64::P5, AArch64::P6, AArch64::P7, AArch64::P8, AArch64::P9,
          AArch64::P10, AArch64::P11, AArch64::P12, AArch64::P13, AArch64::P14,
          AArch64::P15}) {
      if (RegsToZero[PReg])
        BuildMI(MBB, MBBI, DL, TII.get(AArch64::PFALSE), PReg);
    }
  }
}

static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs,
                                   const MachineBasicBlock &MBB) {
  const MachineFunction *MF = MBB.getParent();
  LiveRegs.addLiveIns(MBB);
  // Mark callee saved registers as used so we will not choose them.
  const MCPhysReg *CSRegs = MF->getRegInfo().getCalleeSavedRegs();
  for (unsigned i = 0; CSRegs[i]; ++i)
    LiveRegs.addReg(CSRegs[i]);
}

// Find a scratch register that we can use at the start of the prologue to
// re-align the stack pointer. We avoid using callee-save registers since they
// may appear to be free when this is called from canUseAsPrologue (during
// shrink wrapping), but then no longer be free when this is called from
// emitPrologue.
//
// FIXME: This is a bit conservative, since in the above case we could use one
// of the callee-save registers as a scratch temp to re-align the stack
// pointer, but we would then have to make sure that we were in fact saving at
// least one callee-save register in the prologue, which is additional
// complexity that doesn't seem worth the benefit.
static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB) {
  MachineFunction *MF = MBB->getParent();

  // If MBB is an entry block, use X9 as the scratch register.
  if (&MF->front() == MBB)
    return AArch64::X9;

  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo();
  LivePhysRegs LiveRegs(TRI);
  getLiveRegsForEntryMBB(LiveRegs, *MBB);

  // Prefer X9 since it was historically used for the prologue scratch reg.
  const MachineRegisterInfo &MRI = MF->getRegInfo();
  if (LiveRegs.available(MRI, AArch64::X9))
    return AArch64::X9;

  for (unsigned Reg : AArch64::GPR64RegClass) {
    if (LiveRegs.available(MRI, Reg))
      return Reg;
  }
  return AArch64::NoRegister;
}

bool AArch64FrameLowering::canUseAsPrologue(
    const MachineBasicBlock &MBB) const {
  const MachineFunction *MF = MBB.getParent();
  MachineBasicBlock *TmpMBB = const_cast<MachineBasicBlock *>(&MBB);
  const AArch64Subtarget &Subtarget = MF->getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
  const AArch64TargetLowering *TLI = Subtarget.getTargetLowering();
  const AArch64FunctionInfo *AFI = MF->getInfo<AArch64FunctionInfo>();

  if (AFI->hasSwiftAsyncContext()) {
    const AArch64RegisterInfo &TRI = *Subtarget.getRegisterInfo();
    const MachineRegisterInfo &MRI = MF->getRegInfo();
    LivePhysRegs LiveRegs(TRI);
    getLiveRegsForEntryMBB(LiveRegs, MBB);
    // The StoreSwiftAsyncContext clobbers X16 and X17. Make sure they are
    // available.
    if (!LiveRegs.available(MRI, AArch64::X16) ||
        !LiveRegs.available(MRI, AArch64::X17))
      return false;
  }

  // Certain stack probing sequences might clobber flags, then we can't use
  // the block as a prologue if the flags register is a live-in.
  if (AFI->hasStackProbing() &&
      MBB.isLiveIn(AArch64::NZCV))
    return false;

  // Don't need a scratch register if we're not going to re-align the stack or
  // emit stack probes.
  if (!RegInfo->hasStackRealignment(*MF) && !TLI->hasInlineStackProbe(*MF))
    return true;
  // Otherwise, we can use any block as long as it has a scratch register
  // available.
  return findScratchNonCalleeSaveRegister(TmpMBB) != AArch64::NoRegister;
}

static bool windowsRequiresStackProbe(MachineFunction &MF,
                                      uint64_t StackSizeInBytes) {
  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const AArch64FunctionInfo &MFI = *MF.getInfo<AArch64FunctionInfo>();
  // TODO: When implementing stack protectors, take that into account
  // for the probe threshold.
  return Subtarget.isTargetWindows() && MFI.hasStackProbing() &&
         StackSizeInBytes >= uint64_t(MFI.getStackProbeSize());
}

static bool needsWinCFI(const MachineFunction &MF) {
  const Function &F = MF.getFunction();
  return MF.getTarget().getMCAsmInfo()->usesWindowsCFI() &&
         F.needsUnwindTableEntry();
}

bool AArch64FrameLowering::shouldCombineCSRLocalStackBump(
    MachineFunction &MF, uint64_t StackBumpBytes) const {
  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
  const MachineFrameInfo &MFI = MF.getFrameInfo();
  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
  if (homogeneousPrologEpilog(MF))
    return false;

  if (AFI->getLocalStackSize() == 0)
    return false;

  // For WinCFI, if optimizing for size, prefer to not combine the stack bump
  // (to force a stp with predecrement) to match the packed unwind format,
  // provided that there actually are any callee saved registers to merge the
  // decrement with.
  // This is potentially marginally slower, but allows using the packed
  // unwind format for functions that both have a local area and callee saved
  // registers. Using the packed unwind format notably reduces the size of
  // the unwind info.
  if (needsWinCFI(MF) && AFI->getCalleeSavedStackSize() > 0 &&
      MF.getFunction().hasOptSize())
    return false;

  // 512 is the maximum immediate for stp/ldp that will be used for
  // callee-save save/restores
  if (StackBumpBytes >= 512 || windowsRequiresStackProbe(MF, StackBumpBytes))
    return false;

  if (MFI.hasVarSizedObjects())
    return false;

  if (RegInfo->hasStackRealignment(MF))
    return false;

  // This isn't strictly necessary, but it simplifies things a bit since the
  // current RedZone handling code assumes the SP is adjusted by the
  // callee-save save/restore code.
  if (canUseRedZone(MF))
    return false;

  // When there is an SVE area on the stack, always allocate the
  // callee-saves and spills/locals separately.
  if (getSVEStackSize(MF))
    return false;

  return true;
}

bool AArch64FrameLowering::shouldCombineCSRLocalStackBumpInEpilogue(
    MachineBasicBlock &MBB, unsigned StackBumpBytes) const {
  if (!shouldCombineCSRLocalStackBump(*MBB.getParent(), StackBumpBytes))
    return false;

  if (MBB.empty())
    return true;

  // Disable combined SP bump if the last instruction is an MTE tag store. It
  // is almost always better to merge SP adjustment into those instructions.
  MachineBasicBlock::iterator LastI = MBB.getFirstTerminator();
  MachineBasicBlock::iterator Begin = MBB.begin();
  while (LastI != Begin) {
    --LastI;
    if (LastI->isTransient())
      continue;
    if (!LastI->getFlag(MachineInstr::FrameDestroy))
      break;
  }
  switch (LastI->getOpcode()) {
  case AArch64::STGloop:
  case AArch64::STZGloop:
  case AArch64::STGi:
  case AArch64::STZGi:
  case AArch64::ST2Gi:
  case AArch64::STZ2Gi:
    return false;
  default:
    return true;
  }
  llvm_unreachable("unreachable");
}

// Given a load or a store instruction, generate an appropriate unwinding SEH
// code on Windows.
static MachineBasicBlock::iterator InsertSEH(MachineBasicBlock::iterator MBBI,
                                             const TargetInstrInfo &TII,
                                             MachineInstr::MIFlag Flag) {
  unsigned Opc = MBBI->getOpcode();
  MachineBasicBlock *MBB = MBBI->getParent();
  MachineFunction &MF = *MBB->getParent();
  DebugLoc DL = MBBI->getDebugLoc();
  unsigned ImmIdx = MBBI->getNumOperands() - 1;
  int Imm = MBBI->getOperand(ImmIdx).getImm();
  MachineInstrBuilder MIB;
  const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();

  switch (Opc) {
  default:
    llvm_unreachable("No SEH Opcode for this instruction");
  case AArch64::LDPDpost:
    Imm = -Imm;
    [[fallthrough]];
  case AArch64::STPDpre: {
    unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(2).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP_X))
              .addImm(Reg0)
              .addImm(Reg1)
              .addImm(Imm * 8)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::LDPXpost:
    Imm = -Imm;
    [[fallthrough]];
  case AArch64::STPXpre: {
    Register Reg0 = MBBI->getOperand(1).getReg();
    Register Reg1 = MBBI->getOperand(2).getReg();
    if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
      MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR_X))
                .addImm(Imm * 8)
                .setMIFlag(Flag);
    else
      MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP_X))
                .addImm(RegInfo->getSEHRegNum(Reg0))
                .addImm(RegInfo->getSEHRegNum(Reg1))
                .addImm(Imm * 8)
                .setMIFlag(Flag);
    break;
  }
  case AArch64::LDRDpost:
    Imm = -Imm;
    [[fallthrough]];
  case AArch64::STRDpre: {
    unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg_X))
              .addImm(Reg)
              .addImm(Imm)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::LDRXpost:
    Imm = -Imm;
    [[fallthrough]];
  case AArch64::STRXpre: {
    unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg_X))
              .addImm(Reg)
              .addImm(Imm)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::STPDi:
  case AArch64::LDPDi: {
    unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
    unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFRegP))
              .addImm(Reg0)
              .addImm(Reg1)
              .addImm(Imm * 8)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::STPXi:
  case AArch64::LDPXi: {
    Register Reg0 = MBBI->getOperand(0).getReg();
    Register Reg1 = MBBI->getOperand(1).getReg();
    if (Reg0 == AArch64::FP && Reg1 == AArch64::LR)
      MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFPLR))
                .addImm(Imm * 8)
                .setMIFlag(Flag);
    else
      MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveRegP))
                .addImm(RegInfo->getSEHRegNum(Reg0))
                .addImm(RegInfo->getSEHRegNum(Reg1))
                .addImm(Imm * 8)
                .setMIFlag(Flag);
    break;
  }
  case AArch64::STRXui:
  case AArch64::LDRXui: {
    int Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveReg))
              .addImm(Reg)
              .addImm(Imm * 8)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::STRDui:
  case AArch64::LDRDui: {
    unsigned Reg = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveFReg))
              .addImm(Reg)
              .addImm(Imm * 8)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::STPQi:
  case AArch64::LDPQi: {
    unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(0).getReg());
    unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveAnyRegQP))
              .addImm(Reg0)
              .addImm(Reg1)
              .addImm(Imm * 16)
              .setMIFlag(Flag);
    break;
  }
  case AArch64::LDPQpost:
    Imm = -Imm;
    [[fallthrough]];
  case AArch64::STPQpre: {
    unsigned Reg0 = RegInfo->getSEHRegNum(MBBI->getOperand(1).getReg());
    unsigned Reg1 = RegInfo->getSEHRegNum(MBBI->getOperand(2).getReg());
    MIB = BuildMI(MF, DL, TII.get(AArch64::SEH_SaveAnyRegQPX))
              .addImm(Reg0)
              .addImm(Reg1)
              .addImm(Imm * 16)
              .setMIFlag(Flag);
    break;
  }
  }
  auto I = MBB->insertAfter(MBBI, MIB);
  return I;
}

// Fix up the SEH opcode associated with the save/restore instruction.
static void fixupSEHOpcode(MachineBasicBlock::iterator MBBI,
                           unsigned LocalStackSize) {
  MachineOperand *ImmOpnd = nullptr;
  unsigned ImmIdx = MBBI->getNumOperands() - 1;
  switch (MBBI->getOpcode()) {
  default:
    llvm_unreachable("Fix the offset in the SEH instruction");
  case AArch64::SEH_SaveFPLR:
  case AArch64::SEH_SaveRegP:
  case AArch64::SEH_SaveReg:
  case AArch64::SEH_SaveFRegP:
  case AArch64::SEH_SaveFReg:
  case AArch64::SEH_SaveAnyRegQP:
  case AArch64::SEH_SaveAnyRegQPX:
    ImmOpnd = &MBBI->getOperand(ImmIdx);
    break;
  }
  if (ImmOpnd)
    ImmOpnd->setImm(ImmOpnd->getImm() + LocalStackSize);
}

// Convert callee-save register save/restore instruction to do stack pointer
// decrement/increment to allocate/deallocate the callee-save stack area by
// converting store/load to use pre/post increment version.
static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(
    MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
    const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc,
    bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI,
    MachineInstr::MIFlag FrameFlag = MachineInstr::FrameSetup,
    int CFAOffset = 0) {
  unsigned NewOpc;
  switch (MBBI->getOpcode()) {
  default:
    llvm_unreachable("Unexpected callee-save save/restore opcode!");
  case AArch64::STPXi:
    NewOpc = AArch64::STPXpre;
    break;
  case AArch64::STPDi:
    NewOpc = AArch64::STPDpre;
    break;
  case AArch64::STPQi:
    NewOpc = AArch64::STPQpre;
    break;
  case AArch64::STRXui:
    NewOpc = AArch64::STRXpre;
    break;
  case AArch64::STRDui:
    NewOpc = AArch64::STRDpre;
    break;
  case AArch64::STRQui:
    NewOpc = AArch64::STRQpre;
    break;
  case AArch64::LDPXi:
    NewOpc = AArch64::LDPXpost;
    break;
  case AArch64::LDPDi:
    NewOpc = AArch64::LDPDpost;
    break;
  case AArch64::LDPQi:
    NewOpc = AArch64::LDPQpost;
    break;
  case AArch64::LDRXui:
    NewOpc = AArch64::LDRXpost;
    break;
  case AArch64::LDRDui:
    NewOpc = AArch64::LDRDpost;
    break;
  case AArch64::LDRQui:
    NewOpc = AArch64::LDRQpost;
    break;
  }
  // Get rid of the SEH code associated with the old instruction.
  if (NeedsWinCFI) {
    auto SEH = std::next(MBBI);
    if (AArch64InstrInfo::isSEHInstruction(*SEH))
      SEH->eraseFromParent();
  }

  TypeSize Scale = TypeSize::getFixed(1), Width = TypeSize::getFixed(0);
  int64_t MinOffset, MaxOffset;
  bool Success = static_cast<const AArch64InstrInfo *>(TII)->getMemOpInfo(
      NewOpc, Scale, Width, MinOffset, MaxOffset);
  (void)Success;
  assert(Success && "unknown load/store opcode");

  // If the first store isn't right where we want SP then we can't fold the
  // update in so create a normal arithmetic instruction instead.
  MachineFunction &MF = *MBB.getParent();
  if (MBBI->getOperand(MBBI->getNumOperands() - 1).getImm() != 0 ||
      CSStackSizeInc < MinOffset || CSStackSizeInc > MaxOffset) {
    emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
                    StackOffset::getFixed(CSStackSizeInc), TII, FrameFlag,
                    false, false, nullptr, EmitCFI,
                    StackOffset::getFixed(CFAOffset));

    return std::prev(MBBI);
  }

  MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII->get(NewOpc));
  MIB.addReg(AArch64::SP, RegState::Define);

  // Copy all operands other than the immediate offset.
  unsigned OpndIdx = 0;
  for (unsigned OpndEnd = MBBI->getNumOperands() - 1; OpndIdx < OpndEnd;
       ++OpndIdx)
    MIB.add(MBBI->getOperand(OpndIdx));

  assert(MBBI->getOperand(OpndIdx).getImm() == 0 &&
         "Unexpected immediate offset in first/last callee-save save/restore "
         "instruction!");
  assert(MBBI->getOperand(OpndIdx - 1).getReg() == AArch64::SP &&
         "Unexpected base register in callee-save save/restore instruction!");
  assert(CSStackSizeInc % Scale == 0);
  MIB.addImm(CSStackSizeInc / (int)Scale);

  MIB.setMIFlags(MBBI->getFlags());
  MIB.setMemRefs(MBBI->memoperands());

  // Generate a new SEH code that corresponds to the new instruction.
  if (NeedsWinCFI) {
    *HasWinCFI = true;
    InsertSEH(*MIB, *TII, FrameFlag);
  }

  if (EmitCFI) {
    unsigned CFIIndex = MF.addFrameInst(
        MCCFIInstruction::cfiDefCfaOffset(nullptr, CFAOffset - CSStackSizeInc));
    BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlags(FrameFlag);
  }

  return std::prev(MBB.erase(MBBI));
}

// Fixup callee-save register save/restore instructions to take into account
// combined SP bump by adding the local stack size to the stack offsets.
static void fixupCalleeSaveRestoreStackOffset(MachineInstr &MI,
                                              uint64_t LocalStackSize,
                                              bool NeedsWinCFI,
                                              bool *HasWinCFI) {
  if (AArch64InstrInfo::isSEHInstruction(MI))
    return;

  unsigned Opc = MI.getOpcode();
  unsigned Scale;
  switch (Opc) {
  case AArch64::STPXi:
  case AArch64::STRXui:
  case AArch64::STPDi:
  case AArch64::STRDui:
  case AArch64::LDPXi:
  case AArch64::LDRXui:
  case AArch64::LDPDi:
  case AArch64::LDRDui:
    Scale = 8;
    break;
  case AArch64::STPQi:
  case AArch64::STRQui:
  case AArch64::LDPQi:
  case AArch64::LDRQui:
    Scale = 16;
    break;
  default:
    llvm_unreachable("Unexpected callee-save save/restore opcode!");
  }

  unsigned OffsetIdx = MI.getNumExplicitOperands() - 1;
  assert(MI.getOperand(OffsetIdx - 1).getReg() == AArch64::SP &&
         "Unexpected base register in callee-save save/restore instruction!");
  // Last operand is immediate offset that needs fixing.
  MachineOperand &OffsetOpnd = MI.getOperand(OffsetIdx);
  // All generated opcodes have scaled offsets.
  assert(LocalStackSize % Scale == 0);
  OffsetOpnd.setImm(OffsetOpnd.getImm() + LocalStackSize / Scale);

  if (NeedsWinCFI) {
    *HasWinCFI = true;
    auto MBBI = std::next(MachineBasicBlock::iterator(MI));
    assert(MBBI != MI.getParent()->end() && "Expecting a valid instruction");
    assert(AArch64InstrInfo::isSEHInstruction(*MBBI) &&
           "Expecting a SEH instruction");
    fixupSEHOpcode(MBBI, LocalStackSize);
  }
}

static bool isTargetWindows(const MachineFunction &MF) {
  return MF.getSubtarget<AArch64Subtarget>().isTargetWindows();
}

// Convenience function to determine whether I is an SVE callee save.
static bool IsSVECalleeSave(MachineBasicBlock::iterator I) {
  switch (I->getOpcode()) {
  default:
    return false;
  case AArch64::STR_ZXI:
  case AArch64::STR_PXI:
  case AArch64::LDR_ZXI:
  case AArch64::LDR_PXI:
    return I->getFlag(MachineInstr::FrameSetup) ||
           I->getFlag(MachineInstr::FrameDestroy);
  }
}

static void emitShadowCallStackPrologue(const TargetInstrInfo &TII,
                                        MachineFunction &MF,
                                        MachineBasicBlock &MBB,
                                        MachineBasicBlock::iterator MBBI,
                                        const DebugLoc &DL, bool NeedsWinCFI,
                                        bool NeedsUnwindInfo) {
  // Shadow call stack prolog: str x30, [x18], #8
  BuildMI(MBB, MBBI, DL, TII.get(AArch64::STRXpost))
      .addReg(AArch64::X18, RegState::Define)
      .addReg(AArch64::LR)
      .addReg(AArch64::X18)
      .addImm(8)
      .setMIFlag(MachineInstr::FrameSetup);

  // This instruction also makes x18 live-in to the entry block.
  MBB.addLiveIn(AArch64::X18);

  if (NeedsWinCFI)
    BuildMI(MBB, MBBI, DL, TII.get(AArch64::SEH_Nop))
        .setMIFlag(MachineInstr::FrameSetup);

  if (NeedsUnwindInfo) {
    // Emit a CFI instruction that causes 8 to be subtracted from the value of
    // x18 when unwinding past this frame.
    static const char CFIInst[] = {
        dwarf::DW_CFA_val_expression,
        18, // register
        2,  // length
        static_cast<char>(unsigned(dwarf::DW_OP_breg18)),
        static_cast<char>(-8) & 0x7f, // addend (sleb128)
    };
    unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::createEscape(
        nullptr, StringRef(CFIInst, sizeof(CFIInst))));
    BuildMI(MBB, MBBI, DL, TII.get(AArch64::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlag(MachineInstr::FrameSetup);
  }
}

static void emitShadowCallStackEpilogue(const TargetInstrInfo &TII,
                                        MachineFunction &MF,
                                        MachineBasicBlock &MBB,
                                        MachineBasicBlock::iterator MBBI,
                                        const DebugLoc &DL) {
  // Shadow call stack epilog: ldr x30, [x18, #-8]!
  BuildMI(MBB, MBBI, DL, TII.get(AArch64::LDRXpre))
      .addReg(AArch64::X18, RegState::Define)
      .addReg(AArch64::LR, RegState::Define)
      .addReg(AArch64::X18)
      .addImm(-8)
      .setMIFlag(MachineInstr::FrameDestroy);

  if (MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(MF)) {
    unsigned CFIIndex =
        MF.addFrameInst(MCCFIInstruction::createRestore(nullptr, 18));
    BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlags(MachineInstr::FrameDestroy);
  }
}

// Define the current CFA rule to use the provided FP.
static void emitDefineCFAWithFP(MachineFunction &MF, MachineBasicBlock &MBB,
                                MachineBasicBlock::iterator MBBI,
                                const DebugLoc &DL, unsigned FixedObject) {
  const AArch64Subtarget &STI = MF.getSubtarget<AArch64Subtarget>();
  const AArch64RegisterInfo *TRI = STI.getRegisterInfo();
  const TargetInstrInfo *TII = STI.getInstrInfo();
  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();

  const int OffsetToFirstCalleeSaveFromFP =
      AFI->getCalleeSaveBaseToFrameRecordOffset() -
      AFI->getCalleeSavedStackSize();
  Register FramePtr = TRI->getFrameRegister(MF);
  unsigned Reg = TRI->getDwarfRegNum(FramePtr, true);
  unsigned CFIIndex = MF.addFrameInst(MCCFIInstruction::cfiDefCfa(
      nullptr, Reg, FixedObject - OffsetToFirstCalleeSaveFromFP));
  BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
      .addCFIIndex(CFIIndex)
      .setMIFlags(MachineInstr::FrameSetup);
}

#ifndef NDEBUG
/// Collect live registers from the end of \p MI's parent up to (including) \p
/// MI in \p LiveRegs.
static void getLivePhysRegsUpTo(const MachineInstr &MI,
                                const TargetRegisterInfo &TRI,
                                LivePhysRegs &LiveRegs) {

  const MachineBasicBlock &MBB = *MI.getParent();
  LiveRegs.addLiveOuts(MBB);
  for (const MachineInstr &MI :
       reverse(make_range(MI.getIterator(), MBB.instr_end())))
    LiveRegs.stepBackward(MI);
}
#endif

1617 MachineBasicBlock &MBB) const {
1619 const MachineFrameInfo &MFI = MF.getFrameInfo();
1620 const Function &F = MF.getFunction();
1621 const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
1622 const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
1623 const TargetInstrInfo *TII = Subtarget.getInstrInfo();
1624
1625 MachineModuleInfo &MMI = MF.getMMI();
1627 bool EmitCFI = AFI->needsDwarfUnwindInfo(MF);
1628 bool EmitAsyncCFI = AFI->needsAsyncDwarfUnwindInfo(MF);
1629 bool HasFP = hasFP(MF);
1630 bool NeedsWinCFI = needsWinCFI(MF);
1631 bool HasWinCFI = false;
1632 auto Cleanup = make_scope_exit([&]() { MF.setHasWinCFI(HasWinCFI); });
1633
1635#ifndef NDEBUG
1637 // Collect live register from the end of MBB up to the start of the existing
1638 // frame setup instructions.
1639 MachineBasicBlock::iterator NonFrameStart = MBB.begin();
1640 while (NonFrameStart != End &&
1641 NonFrameStart->getFlag(MachineInstr::FrameSetup))
1642 ++NonFrameStart;
1643
1644 LivePhysRegs LiveRegs(*TRI);
1645 if (NonFrameStart != MBB.end()) {
1646 getLivePhysRegsUpTo(*NonFrameStart, *TRI, LiveRegs);
1647 // Ignore registers used for stack management for now.
1648 LiveRegs.removeReg(AArch64::SP);
1649 LiveRegs.removeReg(AArch64::X19);
1650 LiveRegs.removeReg(AArch64::FP);
1651 LiveRegs.removeReg(AArch64::LR);
1652 }
1653
1654 auto VerifyClobberOnExit = make_scope_exit([&]() {
1655 if (NonFrameStart == MBB.end())
1656 return;
1657 // Check if any of the newly instructions clobber any of the live registers.
1658 for (MachineInstr &MI :
1659 make_range(MBB.instr_begin(), NonFrameStart->getIterator())) {
1660 for (auto &Op : MI.operands())
1661 if (Op.isReg() && Op.isDef())
1662 assert(!LiveRegs.contains(Op.getReg()) &&
1663 "live register clobbered by inserted prologue instructions");
1664 }
1665 });
1666#endif
1667
1668 bool IsFunclet = MBB.isEHFuncletEntry();
1669
1670 // At this point, we're going to decide whether or not the function uses a
1671 // redzone. In most cases, the function doesn't have a redzone so let's
1672 // assume that's false and set it to true in the case that there's a redzone.
1673 AFI->setHasRedZone(false);
1674
1675 // Debug location must be unknown since the first debug location is used
1676 // to determine the end of the prologue.
1677 DebugLoc DL;
1678
1679 const auto &MFnI = *MF.getInfo<AArch64FunctionInfo>();
1680 if (MFnI.needsShadowCallStackPrologueEpilogue(MF))
1681 emitShadowCallStackPrologue(*TII, MF, MBB, MBBI, DL, NeedsWinCFI,
1682 MFnI.needsDwarfUnwindInfo(MF));
1683
1684 if (MFnI.shouldSignReturnAddress(MF)) {
1685 BuildMI(MBB, MBBI, DL, TII->get(AArch64::PAUTH_PROLOGUE))
1687 if (NeedsWinCFI)
1688 HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR
1689 }
1690
1691 if (EmitCFI && MFnI.isMTETagged()) {
1692 BuildMI(MBB, MBBI, DL, TII->get(AArch64::EMITMTETAGGED))
1694 }
1695
1696 // We signal the presence of a Swift extended frame to external tools by
1697 // storing FP with 0b0001 in bits 63:60. In normal userland operation a simple
1698 // ORR is sufficient, it is assumed a Swift kernel would initialize the TBI
1699 // bits so that is still true.
1700 if (HasFP && AFI->hasSwiftAsyncContext()) {
1703 if (Subtarget.swiftAsyncContextIsDynamicallySet()) {
1704 // The special symbol below is absolute and has a *value* that can be
1705 // combined with the frame pointer to signal an extended frame.
1706 BuildMI(MBB, MBBI, DL, TII->get(AArch64::LOADgot), AArch64::X16)
1707 .addExternalSymbol("swift_async_extendedFramePointerFlags",
1709 if (NeedsWinCFI) {
1710 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1712 HasWinCFI = true;
1713 }
1714 BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXrs), AArch64::FP)
1715 .addUse(AArch64::FP)
1716 .addUse(AArch64::X16)
1717 .addImm(Subtarget.isTargetILP32() ? 32 : 0);
1718 if (NeedsWinCFI) {
1719 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1721 HasWinCFI = true;
1722 }
1723 break;
1724 }
1725 [[fallthrough]];
1726
1728 // ORR x29, x29, #0x1000_0000_0000_0000
1729 BuildMI(MBB, MBBI, DL, TII->get(AArch64::ORRXri), AArch64::FP)
1730 .addUse(AArch64::FP)
1731 .addImm(0x1100)
1733 if (NeedsWinCFI) {
1734 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1736 HasWinCFI = true;
1737 }
1738 break;
1739
1741 break;
1742 }
1743 }
1744
1745 // All calls are tail calls in GHC calling conv, and functions have no
1746 // prologue/epilogue.
1748 return;
1749
1750 // Set tagged base pointer to the requested stack slot.
1751 // Ideally it should match SP value after prologue.
1752 std::optional<int> TBPI = AFI->getTaggedBasePointerIndex();
1753 if (TBPI)
1754 AFI->setTaggedBasePointerOffset(-MFI.getObjectOffset(*TBPI));
1755 else
1756 AFI->setTaggedBasePointerOffset(MFI.getStackSize());
1757
1758 const StackOffset &SVEStackSize = getSVEStackSize(MF);
1759
1760 // getStackSize() includes all the locals in its size calculation. We don't
1761 // include these locals when computing the stack size of a funclet, as they
1762 // are allocated in the parent's stack frame and accessed via the frame
1763 // pointer from the funclet. We only save the callee saved registers in the
1764 // funclet, which are really the callee saved registers of the parent
1765 // function, including the funclet.
1766 int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
1767 : MFI.getStackSize();
1768 if (!AFI->hasStackFrame() && !windowsRequiresStackProbe(MF, NumBytes)) {
1769 assert(!HasFP && "unexpected function without stack frame but with FP");
1770 assert(!SVEStackSize &&
1771 "unexpected function without stack frame but with SVE objects");
1772 // All of the stack allocation is for locals.
1773 AFI->setLocalStackSize(NumBytes);
1774 if (!NumBytes)
1775 return;
1776 // REDZONE: If the stack size is less than 128 bytes, we don't need
1777 // to actually allocate.
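// For example, a leaf function with 96 bytes of locals can leave SP
// untouched and address them at [sp, #-96] through [sp, #-8], relying on
// the 128 bytes below SP that the target ABI (e.g. Darwin) guarantees will
// not be clobbered asynchronously.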
1778 if (canUseRedZone(MF)) {
1779 AFI->setHasRedZone(true);
1780 ++NumRedZoneFunctions;
1781 } else {
1782 emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1783 StackOffset::getFixed(-NumBytes), TII,
1784 MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1785 if (EmitCFI) {
1786 // Label used to tie together the PROLOG_LABEL and the MachineMoves.
1787 MCSymbol *FrameLabel = MMI.getContext().createTempSymbol();
1788 // Encode the stack size of the leaf function.
1789 unsigned CFIIndex = MF.addFrameInst(
1790 MCCFIInstruction::cfiDefCfaOffset(FrameLabel, NumBytes));
1791 BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
1792 .addCFIIndex(CFIIndex)
1793 .setMIFlags(MachineInstr::FrameSetup);
1794 }
1795 }
1796
1797 if (NeedsWinCFI) {
1798 HasWinCFI = true;
1799 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1800 .setMIFlag(MachineInstr::FrameSetup);
1801 }
1802
1803 return;
1804 }
1805
1806 bool IsWin64 =
1807 Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
1808 unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
1809
1810 auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
1811 // All of the remaining stack allocations are for locals.
1812 AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
1813 bool CombineSPBump = shouldCombineCSRLocalStackBump(MF, NumBytes);
1814 bool HomPrologEpilog = homogeneousPrologEpilog(MF);
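// If the bump is combined, a single "sub sp, sp, #<framesize>" below
// allocates the callee-save area and locals together, and the loop that
// follows rewrites the callee-save store offsets to match; otherwise the
// first callee-save store is converted to a pre-decrement form such as
// "stp x29, x30, [sp, #-<PrologueSaveSize>]!" (offsets illustrative).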
1815 if (CombineSPBump) {
1816 assert(!SVEStackSize && "Cannot combine SP bump with SVE");
1817 emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
1818 StackOffset::getFixed(-NumBytes), TII,
1819 MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI,
1820 EmitAsyncCFI);
1821 NumBytes = 0;
1822 } else if (HomPrologEpilog) {
1823 // Stack has been already adjusted.
1824 NumBytes -= PrologueSaveSize;
1825 } else if (PrologueSaveSize != 0) {
1826 MBBI = convertCalleeSaveRestoreToSPPrePostIncDec(
1827 MBB, MBBI, DL, TII, -PrologueSaveSize, NeedsWinCFI, &HasWinCFI,
1828 EmitAsyncCFI);
1829 NumBytes -= PrologueSaveSize;
1830 }
1831 assert(NumBytes >= 0 && "Negative stack allocation size!?");
1832
1833 // Move past the saves of the callee-saved registers, fixing up the offsets
1834 // and pre-inc if we decided to combine the callee-save and local stack
1835 // pointer bump above.
1836 while (MBBI != End && MBBI->getFlag(MachineInstr::FrameSetup) &&
1837 !IsSVECalleeSave(MBBI)) {
1838 if (CombineSPBump)
1839 fixupCalleeSaveRestoreStackOffset(*MBBI, AFI->getLocalStackSize(),
1840 NeedsWinCFI, &HasWinCFI);
1841 ++MBBI;
1842 }
1843
1844 // For funclets the FP belongs to the containing function.
1845 if (!IsFunclet && HasFP) {
1846 // Only set up FP if we actually need to.
1847 int64_t FPOffset = AFI->getCalleeSaveBaseToFrameRecordOffset();
1848
1849 if (CombineSPBump)
1850 FPOffset += AFI->getLocalStackSize();
1851
1852 if (AFI->hasSwiftAsyncContext()) {
1853 // Before we update the live FP we have to ensure there's a valid (or
1854 // null) asynchronous context in its slot just before FP in the frame
1855 // record, so store it now.
1856 const auto &Attrs = MF.getFunction().getAttributes();
1857 bool HaveInitialContext = Attrs.hasAttrSomewhere(Attribute::SwiftAsync);
1858 if (HaveInitialContext)
1859 MBB.addLiveIn(AArch64::X22);
1860 Register Reg = HaveInitialContext ? AArch64::X22 : AArch64::XZR;
1861 BuildMI(MBB, MBBI, DL, TII->get(AArch64::StoreSwiftAsyncContext))
1862 .addUse(Reg)
1863 .addUse(AArch64::SP)
1864 .addImm(FPOffset - 8)
1865 .setMIFlag(MachineInstr::FrameSetup);
1866 if (NeedsWinCFI) {
1867 // WinCFI and arm64e, where StoreSwiftAsyncContext is expanded
1868 // to multiple instructions, should be mutually exclusive.
1869 assert(Subtarget.getTargetTriple().getArchName() != "arm64e");
1870 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1871 .setMIFlag(MachineInstr::FrameSetup);
1872 HasWinCFI = true;
1873 }
1874 }
1875
1876 if (HomPrologEpilog) {
1877 auto Prolog = MBBI;
1878 --Prolog;
1879 assert(Prolog->getOpcode() == AArch64::HOM_Prolog);
1880 Prolog->addOperand(MachineOperand::CreateImm(FPOffset));
1881 } else {
1882 // Issue sub fp, sp, FPOffset or
1883 // mov fp,sp when FPOffset is zero.
1884 // Note: All stores of callee-saved registers are marked as "FrameSetup".
1885 // This code marks the instruction(s) that set the FP also.
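// In the common case this materializes, e.g., "add x29, sp, #16" so that
// x29 points at the spilled {prev_fp, prev_lr} frame record inside the
// callee-save area (the offset here is illustrative).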
1886 emitFrameOffset(MBB, MBBI, DL, AArch64::FP, AArch64::SP,
1887 StackOffset::getFixed(FPOffset), TII,
1888 MachineInstr::FrameSetup, false, NeedsWinCFI, &HasWinCFI);
1889 if (NeedsWinCFI && HasWinCFI) {
1890 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
1891 .setMIFlag(MachineInstr::FrameSetup);
1892 // After setting up the FP, the rest of the prolog doesn't need to be
1893 // included in the SEH unwind info.
1894 NeedsWinCFI = false;
1895 }
1896 }
1897 if (EmitAsyncCFI)
1898 emitDefineCFAWithFP(MF, MBB, MBBI, DL, FixedObject);
1899 }
1900
1901 // Now emit the moves for whatever callee saved regs we have (including FP,
1902 // LR if those are saved). Frame instructions for SVE registers are emitted
1903 // later, after the instructions which actually save the SVE regs.
1904 if (EmitAsyncCFI)
1905 emitCalleeSavedGPRLocations(MBB, MBBI);
1906
1907 // Alignment is required for the parent frame, not the funclet
1908 const bool NeedsRealignment =
1909 NumBytes && !IsFunclet && RegInfo->hasStackRealignment(MF);
1910 const int64_t RealignmentPadding =
1911 (NeedsRealignment && MFI.getMaxAlign() > Align(16))
1912 ? MFI.getMaxAlign().value() - 16
1913 : 0;
1914
1915 if (windowsRequiresStackProbe(MF, NumBytes + RealignmentPadding)) {
1916 uint64_t NumWords = (NumBytes + RealignmentPadding) >> 4;
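// The Windows AArch64 stack probe helper takes the allocation size in x15
// in units of 16 bytes (hence the shift by 4); the "sub sp, sp, x15, uxtx #4"
// emitted after the call scales it back to bytes.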
1917 if (NeedsWinCFI) {
1918 HasWinCFI = true;
1919 // alloc_l can hold at most 256MB, so assume that NumBytes doesn't
1920 // exceed this amount. We need to move at most 2^24 - 1 into x15.
1921 // This is at most two instructions, MOVZ followed by MOVK.
1922 // TODO: Fix to use multiple stack alloc unwind codes for stacks
1923 // exceeding 256MB in size.
1924 if (NumBytes >= (1 << 28))
1925 report_fatal_error("Stack size cannot exceed 256MB for stack "
1926 "unwinding purposes");
1927
1928 uint32_t LowNumWords = NumWords & 0xFFFF;
1929 BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVZXi), AArch64::X15)
1930 .addImm(LowNumWords)
1931 .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0))
1932 .setMIFlag(MachineInstr::FrameSetup);
1933 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1934 .setMIFlag(MachineInstr::FrameSetup);
1935 if ((NumWords & 0xFFFF0000) != 0) {
1936 BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVKXi), AArch64::X15)
1937 .addReg(AArch64::X15)
1938 .addImm((NumWords & 0xFFFF0000) >> 16) // High half
1939 .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 16))
1940 .setMIFlag(MachineInstr::FrameSetup);
1941 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1942 .setMIFlag(MachineInstr::FrameSetup);
1943 }
1944 } else {
1945 BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVi64imm), AArch64::X15)
1946 .addImm(NumWords)
1947 .setMIFlag(MachineInstr::FrameSetup);
1948 }
1949
1950 const char* ChkStk = Subtarget.getChkStkName();
1951 switch (MF.getTarget().getCodeModel()) {
1952 case CodeModel::Tiny:
1953 case CodeModel::Small:
1954 case CodeModel::Medium:
1955 case CodeModel::Kernel:
1956 BuildMI(MBB, MBBI, DL, TII->get(AArch64::BL))
1957 .addExternalSymbol(ChkStk)
1958 .addReg(AArch64::X15, RegState::Implicit)
1959 .addReg(AArch64::X16, RegState::Implicit | RegState::Define | RegState::Dead)
1960 .addReg(AArch64::X17, RegState::Implicit | RegState::Define | RegState::Dead)
1961 .addReg(AArch64::NZCV, RegState::Implicit | RegState::Define | RegState::Dead)
1962 .setMIFlags(MachineInstr::FrameSetup);
1963 if (NeedsWinCFI) {
1964 HasWinCFI = true;
1965 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1966 .setMIFlag(MachineInstr::FrameSetup);
1967 }
1968 break;
1969 case CodeModel::Large:
1970 BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVaddrEXT))
1971 .addReg(AArch64::X16, RegState::Define)
1972 .addExternalSymbol(ChkStk)
1973 .addExternalSymbol(ChkStk)
1974 .setMIFlags(MachineInstr::FrameSetup);
1975 if (NeedsWinCFI) {
1976 HasWinCFI = true;
1977 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1978 .setMIFlag(MachineInstr::FrameSetup);
1979 }
1980
1981 BuildMI(MBB, MBBI, DL, TII->get(getBLRCallOpcode(MF)))
1982 .addReg(AArch64::X16, RegState::Kill)
1983 .addReg(AArch64::X15, RegState::Implicit | RegState::Define)
1984 .addReg(AArch64::X16, RegState::Implicit | RegState::Define | RegState::Dead)
1985 .addReg(AArch64::X17, RegState::Implicit | RegState::Define | RegState::Dead)
1986 .addReg(AArch64::NZCV, RegState::Implicit | RegState::Define | RegState::Dead)
1987 .setMIFlags(MachineInstr::FrameSetup);
1988 if (NeedsWinCFI) {
1989 HasWinCFI = true;
1990 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
1991 .setMIFlag(MachineInstr::FrameSetup);
1992 }
1993 break;
1994 }
1995
1996 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SUBXrx64), AArch64::SP)
1997 .addReg(AArch64::SP, RegState::Kill)
1998 .addReg(AArch64::X15, RegState::Kill)
1999 .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 4))
2000 .setMIFlags(MachineInstr::FrameSetup);
2001 if (NeedsWinCFI) {
2002 HasWinCFI = true;
2003 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_StackAlloc))
2004 .addImm(NumBytes)
2005 .setMIFlag(MachineInstr::FrameSetup);
2006 }
2007 NumBytes = 0;
2008
2009 if (RealignmentPadding > 0) {
2010 if (RealignmentPadding >= 4096) {
2011 BuildMI(MBB, MBBI, DL, TII->get(AArch64::MOVi64imm))
2012 .addReg(AArch64::X16, RegState::Define)
2013 .addImm(RealignmentPadding)
2014 .setMIFlag(MachineInstr::FrameSetup);
2015 BuildMI(MBB, MBBI, DL, TII->get(AArch64::ADDXrx64), AArch64::X15)
2016 .addReg(AArch64::SP)
2017 .addReg(AArch64::X16, RegState::Kill)
2018 .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
2019 .setMIFlag(MachineInstr::FrameSetup);
2020 } else {
2021 BuildMI(MBB, MBBI, DL, TII->get(AArch64::ADDXri), AArch64::X15)
2022 .addReg(AArch64::SP)
2023 .addImm(RealignmentPadding)
2024 .addImm(0)
2025 .setMIFlag(MachineInstr::FrameSetup);
2026 }
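// At this point X15 = SP + RealignmentPadding; the AND below clears the
// low bits. For a hypothetical 64-byte MaxAlign this is:
//   add x15, sp, #48
//   and sp, x15, #0xffffffffffffffc0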
2027
2028 uint64_t AndMask = ~(MFI.getMaxAlign().value() - 1);
2029 BuildMI(MBB, MBBI, DL, TII->get(AArch64::ANDXri), AArch64::SP)
2030 .addReg(AArch64::X15, RegState::Kill)
2031 .addImm(AArch64_AM::encodeLogicalImmediate(AndMask, 64));
2032 AFI->setStackRealigned(true);
2033
2034 // No need for SEH instructions here; if we're realigning the stack,
2035 // we've set a frame pointer and already finished the SEH prologue.
2036 assert(!NeedsWinCFI);
2037 }
2038 }
2039
2040 StackOffset SVECalleeSavesSize = {}, SVELocalsSize = SVEStackSize;
2041 MachineBasicBlock::iterator CalleeSavesBegin = MBBI, CalleeSavesEnd = MBBI;
2042
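// SVE frame sizes are "scalable": multiples of the SVE register width that
// are only known at run time, which is why StackOffset carries separate
// fixed and scalable components throughout this file.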
2043 // Process the SVE callee-saves to determine what space needs to be
2044 // allocated.
2045 if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2046 LLVM_DEBUG(dbgs() << "SVECalleeSavedStackSize = " << CalleeSavedSize
2047 << "\n");
2048 // Find callee save instructions in frame.
2049 CalleeSavesBegin = MBBI;
2050 assert(IsSVECalleeSave(CalleeSavesBegin) && "Unexpected instruction");
2051 while (IsSVECalleeSave(MBBI) && MBBI != MBB.getFirstTerminator())
2052 ++MBBI;
2053 CalleeSavesEnd = MBBI;
2054
2055 SVECalleeSavesSize = StackOffset::getScalable(CalleeSavedSize);
2056 SVELocalsSize = SVEStackSize - SVECalleeSavesSize;
2057 }
2058
2059 // Allocate space for the callee saves (if any).
2060 StackOffset CFAOffset =
2061 StackOffset::getFixed((int64_t)MFI.getStackSize() - NumBytes);
2062 StackOffset LocalsSize = SVELocalsSize + StackOffset::getFixed(NumBytes);
2063 allocateStackSpace(MBB, CalleeSavesBegin, 0, SVECalleeSavesSize, false,
2064 nullptr, EmitAsyncCFI && !HasFP, CFAOffset,
2065 MFI.hasVarSizedObjects() || LocalsSize);
2066 CFAOffset += SVECalleeSavesSize;
2067
2068 if (EmitAsyncCFI)
2069 emitCalleeSavedSVELocations(MBB, CalleeSavesEnd);
2070
2071 // Allocate space for the rest of the frame including SVE locals. Align the
2072 // stack as necessary.
2073 assert(!(canUseRedZone(MF) && NeedsRealignment) &&
2074 "Cannot use redzone with stack realignment");
2075 if (!canUseRedZone(MF)) {
2076 // FIXME: in the case of dynamic re-alignment, NumBytes doesn't have
2077 // the correct value here, as NumBytes also includes padding bytes,
2078 // which shouldn't be counted here.
2079 allocateStackSpace(MBB, CalleeSavesEnd, RealignmentPadding,
2080 SVELocalsSize + StackOffset::getFixed(NumBytes),
2081 NeedsWinCFI, &HasWinCFI, EmitAsyncCFI && !HasFP,
2082 CFAOffset, MFI.hasVarSizedObjects());
2083 }
2084
2085 // If we need a base pointer, set it up here. It's whatever the value of the
2086 // stack pointer is at this point. Any variable size objects will be allocated
2087 // after this, so we can still use the base pointer to reference locals.
2088 //
2089 // FIXME: Clarify FrameSetup flags here.
2090 // Note: Use emitFrameOffset() like above for FP if the FrameSetup flag is
2091 // needed.
2092 // For funclets the BP belongs to the containing function.
2093 if (!IsFunclet && RegInfo->hasBasePointer(MF)) {
2094 TII->copyPhysReg(MBB, MBBI, DL, RegInfo->getBaseRegister(), AArch64::SP,
2095 false);
2096 if (NeedsWinCFI) {
2097 HasWinCFI = true;
2098 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
2099 .setMIFlag(MachineInstr::FrameSetup);
2100 }
2101 }
2102
2103 // The very last FrameSetup instruction indicates the end of prologue. Emit a
2104 // SEH opcode indicating the prologue end.
2105 if (NeedsWinCFI && HasWinCFI) {
2106 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_PrologEnd))
2107 .setMIFlag(MachineInstr::FrameSetup);
2108 }
2109
2110 // SEH funclets are passed the frame pointer in X1. If the parent
2111 // function uses the base register, then the base register is used
2112 // directly, and is not retrieved from X1.
2113 if (IsFunclet && F.hasPersonalityFn()) {
2114 EHPersonality Per = classifyEHPersonality(F.getPersonalityFn());
2115 if (isAsynchronousEHPersonality(Per)) {
2116 BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::COPY), AArch64::FP)
2117 .addReg(AArch64::X1)
2118 .setMIFlag(MachineInstr::FrameSetup);
2119 MBB.addLiveIn(AArch64::X1);
2120 }
2121 }
2122
2123 if (EmitCFI && !EmitAsyncCFI) {
2124 if (HasFP) {
2125 emitDefineCFAWithFP(MF, MBB, MBBI, DL, FixedObject);
2126 } else {
2127 StackOffset TotalSize =
2128 SVEStackSize + StackOffset::getFixed((int64_t)MFI.getStackSize());
2129 unsigned CFIIndex = MF.addFrameInst(createDefCFA(
2130 *RegInfo, /*FrameReg=*/AArch64::SP, /*Reg=*/AArch64::SP, TotalSize,
2131 /*LastAdjustmentWasScalable=*/false));
2132 BuildMI(MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2133 .addCFIIndex(CFIIndex)
2134 .setMIFlags(MachineInstr::FrameSetup);
2135 }
2136 emitCalleeSavedGPRLocations(MBB, MBBI);
2137 emitCalleeSavedSVELocations(MBB, MBBI);
2138 }
2139}
2140
2141 static bool isFuncletReturnInstr(const MachineInstr &MI) {
2142 switch (MI.getOpcode()) {
2143 default:
2144 return false;
2145 case AArch64::CATCHRET:
2146 case AArch64::CLEANUPRET:
2147 return true;
2148 }
2149}
2150
2151 void AArch64FrameLowering::emitEpilogue(MachineFunction &MF,
2152 MachineBasicBlock &MBB) const {
2153 MachineBasicBlock::iterator MBBI = MBB.getLastNonDebugInstr();
2154 MachineFrameInfo &MFI = MF.getFrameInfo();
2155 AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
2156 const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2157 const TargetInstrInfo *TII = Subtarget.getInstrInfo();
2158 DebugLoc DL;
2159 bool NeedsWinCFI = needsWinCFI(MF);
2160 bool EmitCFI = AFI->needsAsyncDwarfUnwindInfo(MF);
2161 bool HasWinCFI = false;
2162 bool IsFunclet = false;
2163
2164 if (MBB.end() != MBBI) {
2165 DL = MBBI->getDebugLoc();
2166 IsFunclet = isFuncletReturnInstr(*MBBI);
2167 }
2168
2169 MachineBasicBlock::iterator EpilogStartI = MBB.end();
2170
2171 auto FinishingTouches = make_scope_exit([&]() {
2172 if (AFI->shouldSignReturnAddress(MF)) {
2173 BuildMI(MBB, MBB.getFirstTerminator(), DL,
2174 TII->get(AArch64::PAUTH_EPILOGUE))
2175 .setMIFlag(MachineInstr::FrameDestroy);
2176 if (NeedsWinCFI)
2177 HasWinCFI = true; // AArch64PointerAuth pass will insert SEH_PACSignLR
2178 }
2179 if (AFI->needsShadowCallStackPrologueEpilogue(MF))
2180 emitShadowCallStackEpilogue(*TII, MF, MBB, MBB.getFirstTerminator(), DL);
2181 if (EmitCFI)
2182 emitCalleeSavedGPRRestores(MBB, MBB.getFirstTerminator());
2183 if (HasWinCFI) {
2184 BuildMI(MBB, MBB.getFirstTerminator(), DL,
2185 TII->get(AArch64::SEH_EpilogEnd))
2186 .setMIFlag(MachineInstr::FrameDestroy);
2187 if (!MF.hasWinCFI())
2188 MF.setHasWinCFI(true);
2189 }
2190 if (NeedsWinCFI) {
2191 assert(EpilogStartI != MBB.end());
2192 if (!HasWinCFI)
2193 MBB.erase(EpilogStartI);
2194 }
2195 });
2196
2197 int64_t NumBytes = IsFunclet ? getWinEHFuncletFrameSize(MF)
2198 : MFI.getStackSize();
2199
2200 // All calls are tail calls in GHC calling conv, and functions have no
2201 // prologue/epilogue.
2202 if (MF.getFunction().getCallingConv() == CallingConv::GHC)
2203 return;
2204
2205 // How much of the stack used by incoming arguments this function is expected
2206 // to restore in this particular epilogue.
2207 int64_t ArgumentStackToRestore = getArgumentStackToRestore(MF, MBB);
2208 bool IsWin64 =
2209 Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
2210 unsigned FixedObject = getFixedObjectSize(MF, AFI, IsWin64, IsFunclet);
2211
2212 int64_t AfterCSRPopSize = ArgumentStackToRestore;
2213 auto PrologueSaveSize = AFI->getCalleeSavedStackSize() + FixedObject;
2214 // We cannot rely on the local stack size set in emitPrologue if the function
2215 // has funclets, as funclets have different local stack size requirements, and
2216 // the current value set in emitPrologue may be that of the containing
2217 // function.
2218 if (MF.hasEHFunclets())
2219 AFI->setLocalStackSize(NumBytes - PrologueSaveSize);
2220 if (homogeneousPrologEpilog(MF, &MBB)) {
2221 assert(!NeedsWinCFI);
2222 auto LastPopI = MBB.getFirstTerminator();
2223 if (LastPopI != MBB.begin()) {
2224 auto HomogeneousEpilog = std::prev(LastPopI);
2225 if (HomogeneousEpilog->getOpcode() == AArch64::HOM_Epilog)
2226 LastPopI = HomogeneousEpilog;
2227 }
2228
2229 // Adjust local stack
2230 emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2231 StackOffset::getFixed(AFI->getLocalStackSize()), TII,
2232 MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI);
2233
2234 // SP has been already adjusted while restoring callee save regs.
2235 // We've already bailed out of the case that adjusts SP for arguments.
2236 assert(AfterCSRPopSize == 0);
2237 return;
2238 }
2239 bool CombineSPBump = shouldCombineCSRLocalStackBumpInEpilogue(MBB, NumBytes);
2240 // Assume we can't combine the last pop with the sp restore.
2241
2242 bool CombineAfterCSRBump = false;
2243 if (!CombineSPBump && PrologueSaveSize != 0) {
2244 MachineBasicBlock::iterator Pop = std::prev(MBB.getFirstTerminator());
2245 while (Pop->getOpcode() == TargetOpcode::CFI_INSTRUCTION ||
2246 AArch64InstrInfo::isSEHInstruction(*Pop))
2247 Pop = std::prev(Pop);
2248 // Converting the last ldp to a post-index ldp is valid only if the last
2249 // ldp's offset is 0.
2250 const MachineOperand &OffsetOp = Pop->getOperand(Pop->getNumOperands() - 1);
2251 // If the offset is 0 and the AfterCSR pop is not actually trying to
2252 // allocate more stack for arguments (in space that an untimely interrupt
2253 // may clobber), convert it to a post-index ldp.
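// i.e. rewrite "ldp x29, x30, [sp]" into
// "ldp x29, x30, [sp], #<PrologueSaveSize>", restoring the pair and
// releasing the callee-save area in one instruction.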
2254 if (OffsetOp.getImm() == 0 && AfterCSRPopSize >= 0) {
2255 convertCalleeSaveRestoreToSPPrePostIncDec(
2256 MBB, Pop, DL, TII, PrologueSaveSize, NeedsWinCFI, &HasWinCFI, EmitCFI,
2257 MachineInstr::FrameDestroy, PrologueSaveSize);
2258 } else {
2259 // If not, make sure to emit an add after the last ldp.
2260 // We're doing this by transferring the size to be restored from the
2261 // adjustment *before* the CSR pops to the adjustment *after* the CSR
2262 // pops.
2263 AfterCSRPopSize += PrologueSaveSize;
2264 CombineAfterCSRBump = true;
2265 }
2266 }
2267
2268 // Move past the restores of the callee-saved registers.
2269 // If we plan on combining the sp bump of the local stack size and the callee
2270 // save stack size, we might need to adjust the CSR save and restore offsets.
2271 MachineBasicBlock::iterator LastPopI = MBB.getFirstTerminator();
2272 MachineBasicBlock::iterator Begin = MBB.begin();
2273 while (LastPopI != Begin) {
2274 --LastPopI;
2275 if (!LastPopI->getFlag(MachineInstr::FrameDestroy) ||
2276 IsSVECalleeSave(LastPopI)) {
2277 ++LastPopI;
2278 break;
2279 } else if (CombineSPBump)
2280 fixupCalleeSaveRestoreStackOffset(*LastPopI, AFI->getLocalStackSize(),
2281 NeedsWinCFI, &HasWinCFI);
2282 }
2283
2284 if (NeedsWinCFI) {
2285 // Note that there are cases where we insert SEH opcodes in the
2286 // epilogue when we had no SEH opcodes in the prologue. For
2287 // example, when there is no stack frame but there are stack
2288 // arguments. Insert the SEH_EpilogStart and remove it later if we
2289 // didn't emit any SEH opcodes, to avoid generating WinCFI for
2290 // functions that don't need it.
2291 BuildMI(MBB, LastPopI, DL, TII->get(AArch64::SEH_EpilogStart))
2292 .setMIFlag(MachineInstr::FrameDestroy);
2293 EpilogStartI = LastPopI;
2294 --EpilogStartI;
2295 }
2296
2297 if (hasFP(MF) && AFI->hasSwiftAsyncContext()) {
2298 switch (MF.getTarget().Options.SwiftAsyncFramePointer) {
2299 case SwiftAsyncFramePointerMode::DeploymentBased:
2300 // Avoid the reload as it is GOT relative, and instead fall back to the
2301 // hardcoded value below. This allows a mismatch between the OS and
2302 // application without immediately terminating on the difference.
2303 [[fallthrough]];
2304 case SwiftAsyncFramePointerMode::Always:
2305 // We need to reset FP to its untagged state on return. Bit 60 is
2306 // currently used to show the presence of an extended frame.
2307
2308 // BIC x29, x29, #0x1000_0000_0000_0000
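// Note: 0x10fe is the logical-immediate encoding (N:immr:imms =
// 1:000011:111110) of 0xefff_ffff_ffff_ffff, i.e. every bit except bit 60,
// so the AND below implements the BIC shown above.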
2309 BuildMI(MBB, MBB.getFirstTerminator(), DL, TII->get(AArch64::ANDXri),
2310 AArch64::FP)
2311 .addUse(AArch64::FP)
2312 .addImm(0x10fe)
2313 .setMIFlag(MachineInstr::FrameDestroy);
2314 if (NeedsWinCFI) {
2315 BuildMI(MBB, MBBI, DL, TII->get(AArch64::SEH_Nop))
2316 .setMIFlag(MachineInstr::FrameDestroy);
2317 HasWinCFI = true;
2318 }
2319 break;
2320
2321 case SwiftAsyncFramePointerMode::Never:
2322 break;
2323 }
2324 }
2325
2326 const StackOffset &SVEStackSize = getSVEStackSize(MF);
2327
2328 // If there is a single SP update, insert it before the ret and we're done.
2329 if (CombineSPBump) {
2330 assert(!SVEStackSize && "Cannot combine SP bump with SVE");
2331
2332 // When we are about to restore the CSRs, the CFA register is SP again.
2333 if (EmitCFI && hasFP(MF)) {
2334 const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2335 unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2336 unsigned CFIIndex =
2337 MF.addFrameInst(MCCFIInstruction::cfiDefCfa(nullptr, Reg, NumBytes));
2338 BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2339 .addCFIIndex(CFIIndex)
2340 .setMIFlags(MachineInstr::FrameDestroy);
2341 }
2342
2343 emitFrameOffset(MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2344 StackOffset::getFixed(NumBytes + (int64_t)AfterCSRPopSize),
2345 TII, MachineInstr::FrameDestroy, false, NeedsWinCFI,
2346 &HasWinCFI, EmitCFI, StackOffset::getFixed(NumBytes));
2347 return;
2348 }
2349
2350 NumBytes -= PrologueSaveSize;
2351 assert(NumBytes >= 0 && "Negative stack allocation size!?");
2352
2353 // Process the SVE callee-saves to determine what space needs to be
2354 // deallocated.
2355 StackOffset DeallocateBefore = {}, DeallocateAfter = SVEStackSize;
2356 MachineBasicBlock::iterator RestoreBegin = LastPopI, RestoreEnd = LastPopI;
2357 if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2358 RestoreBegin = std::prev(RestoreEnd);
2359 while (RestoreBegin != MBB.begin() &&
2360 IsSVECalleeSave(std::prev(RestoreBegin)))
2361 --RestoreBegin;
2362
2363 assert(IsSVECalleeSave(RestoreBegin) &&
2364 IsSVECalleeSave(std::prev(RestoreEnd)) && "Unexpected instruction");
2365
2366 StackOffset CalleeSavedSizeAsOffset =
2367 StackOffset::getScalable(CalleeSavedSize);
2368 DeallocateBefore = SVEStackSize - CalleeSavedSizeAsOffset;
2369 DeallocateAfter = CalleeSavedSizeAsOffset;
2370 }
2371
2372 // Deallocate the SVE area.
2373 if (SVEStackSize) {
2374 // If we have stack realignment or variable sized objects on the stack,
2375 // restore the stack pointer from the frame pointer prior to SVE CSR
2376 // restoration.
2377 if (AFI->isStackRealigned() || MFI.hasVarSizedObjects()) {
2378 if (int64_t CalleeSavedSize = AFI->getSVECalleeSavedStackSize()) {
2379 // Set SP to start of SVE callee-save area from which they can
2380 // be reloaded. The code below will deallocate the stack space
2381 // by moving FP -> SP.
2382 emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::FP,
2383 StackOffset::getScalable(-CalleeSavedSize), TII,
2384 MachineInstr::FrameDestroy);
2385 }
2386 } else {
2387 if (AFI->getSVECalleeSavedStackSize()) {
2388 // Deallocate the non-SVE locals first before we can deallocate (and
2389 // restore callee saves) from the SVE area.
2390 emitFrameOffset(
2391 MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2392 StackOffset::getFixed(NumBytes), TII, MachineInstr::FrameDestroy,
2393 false, false, nullptr, EmitCFI && !hasFP(MF),
2394 SVEStackSize + StackOffset::getFixed(NumBytes + PrologueSaveSize));
2395 NumBytes = 0;
2396 }
2397
2398 emitFrameOffset(MBB, RestoreBegin, DL, AArch64::SP, AArch64::SP,
2399 DeallocateBefore, TII, MachineInstr::FrameDestroy, false,
2400 false, nullptr, EmitCFI && !hasFP(MF),
2401 SVEStackSize +
2402 StackOffset::getFixed(NumBytes + PrologueSaveSize));
2403
2404 emitFrameOffset(MBB, RestoreEnd, DL, AArch64::SP, AArch64::SP,
2405 DeallocateAfter, TII, MachineInstr::FrameDestroy, false,
2406 false, nullptr, EmitCFI && !hasFP(MF),
2407 DeallocateAfter +
2408 StackOffset::getFixed(NumBytes + PrologueSaveSize));
2409 }
2410 if (EmitCFI)
2411 emitCalleeSavedSVERestores(MBB, RestoreEnd);
2412 }
2413
2414 if (!hasFP(MF)) {
2415 bool RedZone = canUseRedZone(MF);
2416 // If this was a redzone leaf function, we don't need to restore the
2417 // stack pointer (but we may need to pop stack args for fastcc).
2418 if (RedZone && AfterCSRPopSize == 0)
2419 return;
2420
2421 // Pop the local variables off the stack. If there are no callee-saved
2422 // registers, it means we are actually positioned at the terminator and can
2423 // combine stack increment for the locals and the stack increment for
2424 // callee-popped arguments into (possibly) a single instruction and be done.
2425 bool NoCalleeSaveRestore = PrologueSaveSize == 0;
2426 int64_t StackRestoreBytes = RedZone ? 0 : NumBytes;
2427 if (NoCalleeSaveRestore)
2428 StackRestoreBytes += AfterCSRPopSize;
2429
2430 emitFrameOffset(
2431 MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2432 StackOffset::getFixed(StackRestoreBytes), TII,
2433 MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2434 StackOffset::getFixed((RedZone ? 0 : NumBytes) + PrologueSaveSize));
2435
2436 // If we were able to combine the local stack pop with the argument pop,
2437 // then we're done.
2438 if (NoCalleeSaveRestore || AfterCSRPopSize == 0) {
2439 return;
2440 }
2441
2442 NumBytes = 0;
2443 }
2444
2445 // Restore the original stack pointer.
2446 // FIXME: Rather than doing the math here, we should instead just use
2447 // non-post-indexed loads for the restores if we aren't actually going to
2448 // be able to save any instructions.
2449 if (!IsFunclet && (MFI.hasVarSizedObjects() || AFI->isStackRealigned())) {
2450 emitFrameOffset(
2451 MBB, LastPopI, DL, AArch64::SP, AArch64::FP,
2452 StackOffset::getFixed(-AFI->getCalleeSaveBaseToFrameRecordOffset()),
2453 TII, MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI);
2454 } else if (NumBytes)
2455 emitFrameOffset(MBB, LastPopI, DL, AArch64::SP, AArch64::SP,
2456 StackOffset::getFixed(NumBytes), TII,
2457 MachineInstr::FrameDestroy, false, NeedsWinCFI, &HasWinCFI);
2458
2459 // When we are about to restore the CSRs, the CFA register is SP again.
2460 if (EmitCFI && hasFP(MF)) {
2461 const AArch64RegisterInfo &RegInfo = *Subtarget.getRegisterInfo();
2462 unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
2463 unsigned CFIIndex = MF.addFrameInst(
2464 MCCFIInstruction::cfiDefCfa(nullptr, Reg, PrologueSaveSize));
2465 BuildMI(MBB, LastPopI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
2466 .addCFIIndex(CFIIndex)
2467 .setMIFlags(MachineInstr::FrameDestroy);
2468 }
2469
2470 // This must be placed after the callee-save restore code because that code
2471 // assumes the SP is at the same location as it was after the callee-save save
2472 // code in the prologue.
2473 if (AfterCSRPopSize) {
2474 assert(AfterCSRPopSize > 0 && "attempting to reallocate arg stack that an "
2475 "interrupt may have clobbered");
2476
2477 emitFrameOffset(
2478 MBB, MBB.getFirstTerminator(), DL, AArch64::SP, AArch64::SP,
2479 StackOffset::getFixed(AfterCSRPopSize), TII, MachineInstr::FrameDestroy,
2480 false, NeedsWinCFI, &HasWinCFI, EmitCFI,
2481 StackOffset::getFixed(CombineAfterCSRBump ? PrologueSaveSize : 0));
2482 }
2483}
2484
2485 bool AArch64FrameLowering::enableCFIFixup(MachineFunction &MF) const {
2486 return MF.getTarget().Options.EnableCFIFixup &&
2487 MF.getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(MF);
2488}
2489
2490/// getFrameIndexReference - Provide a base+offset reference to an FI slot for
2491/// debug info. It's the same as what we use for resolving the code-gen
2492/// references for now. FIXME: This can go wrong when references are
2493/// SP-relative and simple call frames aren't used.
2494 StackOffset
2495 AArch64FrameLowering::getFrameIndexReference(const MachineFunction &MF, int FI,
2496 Register &FrameReg) const {
2497 return resolveFrameIndexReference(
2498 MF, FI, FrameReg,
2499 /*PreferFP=*/
2500 MF.getFunction().hasFnAttribute(Attribute::SanitizeHWAddress),
2501 /*ForSimm=*/false);
2502}
2504 StackOffset
2505 AArch64FrameLowering::getNonLocalFrameIndexReference(const MachineFunction &MF,
2506 int FI) const {
2507 return StackOffset::getFixed(getSEHFrameIndexOffset(MF, FI));
2508}
2509
2510 static StackOffset getFPOffset(const MachineFunction &MF,
2511 int64_t ObjectOffset) {
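// This rebases an MFI object offset to be FP-relative: FixedObject skips
// the fixed-size area (Win64 varargs / funclet slots) and FPAdjust accounts
// for FP pointing at the frame record inside the callee-save area rather
// than at the callee-save base.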
2512 const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2513 const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2514 bool IsWin64 =
2515 Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv());
2516 unsigned FixedObject =
2517 getFixedObjectSize(MF, AFI, IsWin64, /*IsFunclet=*/false);
2518 int64_t CalleeSaveSize = AFI->getCalleeSavedStackSize(MF.getFrameInfo());
2519 int64_t FPAdjust =
2520 CalleeSaveSize - AFI->getCalleeSaveBaseToFrameRecordOffset();
2521 return StackOffset::getFixed(ObjectOffset + FixedObject + FPAdjust);
2522}
2523
2524 static StackOffset getStackOffset(const MachineFunction &MF,
2525 int64_t ObjectOffset) {
2526 const auto &MFI = MF.getFrameInfo();
2527 return StackOffset::getFixed(ObjectOffset + (int64_t)MFI.getStackSize());
2528}
2529
2530 // TODO: This function currently does not work for scalable vectors.
2531 int64_t AArch64FrameLowering::getSEHFrameIndexOffset(const MachineFunction &MF,
2532 int FI) const {
2533 const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2534 MF.getSubtarget().getRegisterInfo());
2535 int ObjectOffset = MF.getFrameInfo().getObjectOffset(FI);
2536 return RegInfo->getLocalAddressRegister(MF) == AArch64::FP
2537 ? getFPOffset(MF, ObjectOffset).getFixed()
2538 : getStackOffset(MF, ObjectOffset).getFixed();
2539}
2540
2541 StackOffset AArch64FrameLowering::resolveFrameIndexReference(
2542 const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP,
2543 bool ForSimm) const {
2544 const auto &MFI = MF.getFrameInfo();
2545 int64_t ObjectOffset = MFI.getObjectOffset(FI);
2546 bool isFixed = MFI.isFixedObjectIndex(FI);
2547 bool isSVE = MFI.getStackID(FI) == TargetStackID::ScalableVector;
2548 return resolveFrameOffsetReference(MF, ObjectOffset, isFixed, isSVE, FrameReg,
2549 PreferFP, ForSimm);
2550}
2551
2552 StackOffset AArch64FrameLowering::resolveFrameOffsetReference(
2553 const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE,
2554 Register &FrameReg, bool PreferFP, bool ForSimm) const {
2555 const auto &MFI = MF.getFrameInfo();
2556 const auto *RegInfo = static_cast<const AArch64RegisterInfo *>(
2557 MF.getSubtarget().getRegisterInfo());
2558 const auto *AFI = MF.getInfo<AArch64FunctionInfo>();
2559 const auto &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2560
2561 int64_t FPOffset = getFPOffset(MF, ObjectOffset).getFixed();
2562 int64_t Offset = getStackOffset(MF, ObjectOffset).getFixed();
2563 bool isCSR =
2564 !isFixed && ObjectOffset >= -((int)AFI->getCalleeSavedStackSize(MFI));
2565
2566 const StackOffset &SVEStackSize = getSVEStackSize(MF);
2567
2568 // Use frame pointer to reference fixed objects. Use it for locals if
2569 // there are VLAs or a dynamically realigned SP (and thus the SP isn't
2570 // reliable as a base). Make sure useFPForScavengingIndex() does the
2571 // right thing for the emergency spill slot.
2572 bool UseFP = false;
2573 if (AFI->hasStackFrame() && !isSVE) {
2574 // We shouldn't prefer using the FP to access fixed-sized stack objects when
2575 // there are scalable (SVE) objects in between the FP and the fixed-sized
2576 // objects.
2577 PreferFP &= !SVEStackSize;
2578
2579 // Note: Keeping the following as multiple 'if' statements rather than
2580 // merging to a single expression for readability.
2581 //
2582 // Argument access should always use the FP.
2583 if (isFixed) {
2584 UseFP = hasFP(MF);
2585 } else if (isCSR && RegInfo->hasStackRealignment(MF)) {
2586 // References to the CSR area must use FP if we're re-aligning the stack
2587 // since the dynamically-sized alignment padding is between the SP/BP and
2588 // the CSR area.
2589 assert(hasFP(MF) && "Re-aligned stack must have frame pointer");
2590 UseFP = true;
2591 } else if (hasFP(MF) && !RegInfo->hasStackRealignment(MF)) {
2592 // If the FPOffset is negative and we're producing a signed immediate, we
2593 // have to keep in mind that the available offset range for negative
2594 // offsets is smaller than for positive ones. If an offset is available
2595 // via the FP and the SP, use whichever is closest.
2596 bool FPOffsetFits = !ForSimm || FPOffset >= -256;
2597 PreferFP |= Offset > -FPOffset && !SVEStackSize;
2598
2599 if (MFI.hasVarSizedObjects()) {
2600 // If we have variable sized objects, we can use either FP or BP, as the
2601 // SP offset is unknown. We can use the base pointer if we have one and
2602 // FP is not preferred. If not, we're stuck with using FP.
2603 bool CanUseBP = RegInfo->hasBasePointer(MF);
2604 if (FPOffsetFits && CanUseBP) // Both are ok. Pick the best.
2605 UseFP = PreferFP;
2606 else if (!CanUseBP) // Can't use BP. Forced to use FP.
2607 UseFP = true;
2608 // else we can use BP and FP, but the offset from FP won't fit.
2609 // That will make us scavenge registers which we can probably avoid by
2610 // using BP. If it won't fit for BP either, we'll scavenge anyway.
2611 } else if (FPOffset >= 0) {
2612 // Use SP or FP, whichever gives us the best chance of the offset
2613 // being in range for direct access. If the FPOffset is positive,
2614 // that'll always be best, as the SP will be even further away.
2615 UseFP = true;
2616 } else if (MF.hasEHFunclets() && !RegInfo->hasBasePointer(MF)) {
2617 // Funclets access the locals contained in the parent's stack frame
2618 // via the frame pointer, so we have to use the FP in the parent
2619 // function.
2620 (void) Subtarget;
2621 assert(
2622 Subtarget.isCallingConvWin64(MF.getFunction().getCallingConv()) &&
2623 "Funclets should only be present on Win64");
2624 UseFP = true;
2625 } else {
2626 // We have the choice between FP and (SP or BP).
2627 if (FPOffsetFits && PreferFP) // If FP is the best fit, use it.
2628 UseFP = true;
2629 }
2630 }
2631 }
2632
2633 assert(
2634 ((isFixed || isCSR) || !RegInfo->hasStackRealignment(MF) || !UseFP) &&
2635 "In the presence of dynamic stack pointer realignment, "
2636 "non-argument/CSR objects cannot be accessed through the frame pointer");
2637
2638 if (isSVE) {
2639 StackOffset FPOffset =
2640 StackOffset::get(-((int64_t)AFI->getCalleeSavedStackSize()), ObjectOffset);
2641 StackOffset SPOffset =
2642 SVEStackSize +
2643 StackOffset::get(MFI.getStackSize() - AFI->getCalleeSavedStackSize(),
2644 ObjectOffset);
2645 // Always use the FP for SVE spills if available and beneficial.
2646 if (hasFP(MF) && (SPOffset.getFixed() ||
2647 FPOffset.getScalable() < SPOffset.getScalable() ||
2648 RegInfo->hasStackRealignment(MF))) {
2649 FrameReg = RegInfo->getFrameRegister(MF);
2650 return FPOffset;
2651 }
2652
2653 FrameReg = RegInfo->hasBasePointer(MF) ? RegInfo->getBaseRegister()
2654 : (unsigned)AArch64::SP;
2655 return SPOffset;
2656 }
2657
2658 StackOffset ScalableOffset = {};
2659 if (UseFP && !(isFixed || isCSR))
2660 ScalableOffset = -SVEStackSize;
2661 if (!UseFP && (isFixed || isCSR))
2662 ScalableOffset = SVEStackSize;
2663
2664 if (UseFP) {
2665 FrameReg = RegInfo->getFrameRegister(MF);
2666 return StackOffset::getFixed(FPOffset) + ScalableOffset;
2667 }
2668
2669 // Use the base pointer if we have one.
2670 if (RegInfo->hasBasePointer(MF))
2671 FrameReg = RegInfo->getBaseRegister();
2672 else {
2673 assert(!MFI.hasVarSizedObjects() &&
2674 "Can't use SP when we have var sized objects.");
2675 FrameReg = AArch64::SP;
2676 // If we're using the red zone for this function, the SP won't actually
2677 // be adjusted, so the offsets will be negative. They're also all
2678 // within range of the signed 9-bit immediate instructions.
2679 if (canUseRedZone(MF))
2680 Offset -= AFI->getLocalStackSize();
2681 }
2682
2683 return StackOffset::getFixed(Offset) + ScalableOffset;
2684}
2685
2686static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {
2687 // Do not set a kill flag on values that are also marked as live-in. This
2688 // happens with the @llvm.returnaddress intrinsic and with arguments passed in
2689 // callee saved registers.
2690 // Omitting the kill flags is conservatively correct even if the live-in
2691 // is not used after all.
2692 bool IsLiveIn = MF.getRegInfo().isLiveIn(Reg);
2693 return getKillRegState(!IsLiveIn);
2694}
2695
2696 static bool produceCompactUnwindFrame(MachineFunction &MF) {
2697 const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
2698 AttributeList Attrs = MF.getFunction().getAttributes();
2699 return Subtarget.isTargetMachO() &&
2700 !(Subtarget.getTargetLowering()->supportSwiftError() &&
2701 Attrs.hasAttrSomewhere(Attribute::SwiftError)) &&
2702 MF.getFunction().getCallingConv() != CallingConv::SwiftTail;
2703}
2704
2705static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2,
2706 bool NeedsWinCFI, bool IsFirst,
2707 const TargetRegisterInfo *TRI) {
2708 // If we are generating register pairs for a Windows function that requires
2709 // EH support, then pair consecutive registers only. There are no unwind
2710 // opcodes for saves/restores of non-consecutive register pairs.
2711 // The unwind opcodes are save_regp, save_regp_x, save_fregp, save_fregp_x,
2712 // save_lrpair.
2713 // https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling
2714
2715 if (Reg2 == AArch64::FP)
2716 return true;
2717 if (!NeedsWinCFI)
2718 return false;
2719 if (TRI->getEncodingValue(Reg2) == TRI->getEncodingValue(Reg1) + 1)
2720 return false;
2721 // If pairing a GPR with LR, the pair can be described by the save_lrpair
2722 // opcode. If this is the first register pair, it would end up with a
2723 // predecrement, but there's no save_lrpair_x opcode, so we can only do this
2724 // if LR is paired with something other than the first register.
2725 // The save_lrpair opcode requires the first register to be an odd one.
2726 if (Reg1 >= AArch64::X19 && Reg1 <= AArch64::X27 &&
2727 (Reg1 - AArch64::X19) % 2 == 0 && Reg2 == AArch64::LR && !IsFirst)
2728 return false;
2729 return true;
2730}
2731
2732/// Returns true if Reg1 and Reg2 cannot be paired using a ldp/stp instruction.
2733/// WindowsCFI requires that only consecutive registers can be paired.
2734/// LR and FP need to be allocated together when the frame needs to save
2735/// the frame-record. This means any other register pairing with LR is invalid.
2736static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2,
2737 bool UsesWinAAPCS, bool NeedsWinCFI,
2738 bool NeedsFrameRecord, bool IsFirst,
2739 const TargetRegisterInfo *TRI) {
2740 if (UsesWinAAPCS)
2741 return invalidateWindowsRegisterPairing(Reg1, Reg2, NeedsWinCFI, IsFirst,
2742 TRI);
2743
2744 // If we need to store the frame record, don't pair any register
2745 // with LR other than FP.
2746 if (NeedsFrameRecord)
2747 return Reg2 == AArch64::LR;
2748
2749 return false;
2750}
2751
2752namespace {
2753
2754struct RegPairInfo {
2755 unsigned Reg1 = AArch64::NoRegister;
2756 unsigned Reg2 = AArch64::NoRegister;
2757 int FrameIdx;
2758 int Offset;
2759 enum RegType { GPR, FPR64, FPR128, PPR, ZPR } Type;
2760
2761 RegPairInfo() = default;
2762
2763 bool isPaired() const { return Reg2 != AArch64::NoRegister; }
2764
2765 unsigned getScale() const {
2766 switch (Type) {
2767 case PPR:
2768 return 2;
2769 case GPR:
2770 case FPR64:
2771 return 8;
2772 case ZPR:
2773 case FPR128:
2774 return 16;
2775 }
2776 llvm_unreachable("Unsupported type");
2777 }
2778
2779 bool isScalable() const { return Type == PPR || Type == ZPR; }
2780};
2781
2782} // end anonymous namespace
2783
2784 static void computeCalleeSaveRegisterPairs(
2785 MachineFunction &MF, ArrayRef<CalleeSavedInfo> CSI,
2786 const TargetRegisterInfo *TRI, SmallVectorImpl<RegPairInfo> &RegPairs,
2787 bool NeedsFrameRecord) {
2788
2789 if (CSI.empty())
2790 return;
2791
2792 bool IsWindows = isTargetWindows(MF);
2793 bool NeedsWinCFI = needsWinCFI(MF);
2794 AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
2795 MachineFrameInfo &MFI = MF.getFrameInfo();
2796 CallingConv::ID CC = MF.getFunction().getCallingConv();
2797 unsigned Count = CSI.size();
2798 (void)CC;
2799 // MachO's compact unwind format relies on all registers being stored in
2800 // pairs.
2801 assert((!produceCompactUnwindFrame(MF) || CC == CallingConv::PreserveMost ||
2802 CC == CallingConv::PreserveAll || CC == CallingConv::CXX_FAST_TLS ||
2803 CC == CallingConv::Win64 || (Count & 1) == 0) &&
2804 "Odd number of callee-saved regs to spill!");
2805 int ByteOffset = AFI->getCalleeSavedStackSize();
2806 int StackFillDir = -1;
2807 int RegInc = 1;
2808 unsigned FirstReg = 0;
2809 if (NeedsWinCFI) {
2810 // For WinCFI, fill the stack from the bottom up.
2811 ByteOffset = 0;
2812 StackFillDir = 1;
2813 // As the CSI array is reversed to match PrologEpilogInserter, iterate
2814 // backwards, to pair up registers starting from lower numbered registers.
2815 RegInc = -1;
2816 FirstReg = Count - 1;
2817 }
2818 int ScalableByteOffset = AFI->getSVECalleeSavedStackSize();
2819 bool NeedGapToAlignStack = AFI->hasCalleeSaveStackFreeSpace();
2820
2821 // When iterating backwards, the loop condition relies on unsigned wraparound.
2822 for (unsigned i = FirstReg; i < Count; i += RegInc) {
2823 RegPairInfo RPI;
2824 RPI.Reg1 = CSI[i].getReg();
2825
2826 if (AArch64::GPR64RegClass.contains(RPI.Reg1))
2827 RPI.Type = RegPairInfo::GPR;
2828 else if (AArch64::FPR64RegClass.contains(RPI.Reg1))
2829 RPI.Type = RegPairInfo::FPR64;
2830 else if (AArch64::FPR128RegClass.contains(RPI.Reg1))
2831 RPI.Type = RegPairInfo::FPR128;
2832 else if (AArch64::ZPRRegClass.contains(RPI.Reg1))
2833 RPI.Type = RegPairInfo::ZPR;
2834 else if (AArch64::PPRRegClass.contains(RPI.Reg1))
2835 RPI.Type = RegPairInfo::PPR;
2836 else
2837 llvm_unreachable("Unsupported register class.");
2838
2839 // Add the next reg to the pair if it is in the same register class.
2840 if (unsigned(i + RegInc) < Count) {
2841 Register NextReg = CSI[i + RegInc].getReg();
2842 bool IsFirst = i == FirstReg;
2843 switch (RPI.Type) {
2844 case RegPairInfo::GPR:
2845 if (AArch64::GPR64RegClass.contains(NextReg) &&
2846 !invalidateRegisterPairing(RPI.Reg1, NextReg, IsWindows,
2847 NeedsWinCFI, NeedsFrameRecord, IsFirst,
2848 TRI))
2849 RPI.Reg2 = NextReg;
2850 break;
2851 case RegPairInfo::FPR64:
2852 if (AArch64::FPR64RegClass.contains(NextReg) &&
2853 !invalidateWindowsRegisterPairing(RPI.Reg1, NextReg, NeedsWinCFI,
2854 IsFirst, TRI))
2855 RPI.Reg2 = NextReg;
2856 break;
2857 case RegPairInfo::FPR128:
2858 if (AArch64::FPR128RegClass.contains(NextReg))
2859 RPI.Reg2 = NextReg;
2860 break;
2861 case RegPairInfo::PPR:
2862 case RegPairInfo::ZPR:
2863 break;
2864 }
2865 }
2866
2867 // GPRs and FPRs are saved in pairs of 64-bit regs. We expect the CSI
2868 // list to come in sorted by frame index so that we can issue the store
2869 // pair instructions directly. Assert if we see anything otherwise.
2870 //
2871 // The order of the registers in the list is controlled by
2872 // getCalleeSavedRegs(), so they will always be in-order, as well.
2873 assert((!RPI.isPaired() ||
2874 (CSI[i].getFrameIdx() + RegInc == CSI[i + RegInc].getFrameIdx())) &&
2875 "Out of order callee saved regs!");
2876
2877 assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg2 != AArch64::FP ||
2878 RPI.Reg1 == AArch64::LR) &&
2879 "FrameRecord must be allocated together with LR");
2880
2881 // Windows AAPCS has FP and LR reversed.
2882 assert((!RPI.isPaired() || !NeedsFrameRecord || RPI.Reg1 != AArch64::FP ||
2883 RPI.Reg2 == AArch64::LR) &&
2884 "FrameRecord must be allocated together with LR");
2885
2886 // MachO's compact unwind format relies on all registers being stored in
2887 // adjacent register pairs.
2888 assert((!produceCompactUnwindFrame(MF) || CC == CallingConv::PreserveMost ||
2889 CC == CallingConv::PreserveAll || CC == CallingConv::CXX_FAST_TLS ||
2890 CC == CallingConv::Win64 ||
2891 (RPI.isPaired() &&
2892 ((RPI.Reg1 == AArch64::LR && RPI.Reg2 == AArch64::FP) ||
2893 RPI.Reg1 + 1 == RPI.Reg2))) &&
2894 "Callee-save registers not saved as adjacent register pair!");
2895
2896 RPI.FrameIdx = CSI[i].getFrameIdx();
2897 if (NeedsWinCFI &&
2898 RPI.isPaired()) // RPI.FrameIdx must be the lower index of the pair
2899 RPI.FrameIdx = CSI[i + RegInc].getFrameIdx();
2900
2901 int Scale = RPI.getScale();
2902
2903 int OffsetPre = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2904 assert(OffsetPre % Scale == 0);
2905
2906 if (RPI.isScalable())
2907 ScalableByteOffset += StackFillDir * Scale;
2908 else
2909 ByteOffset += StackFillDir * (RPI.isPaired() ? 2 * Scale : Scale);
2910
2911 // Swift's async context is directly before FP, so allocate an extra
2912 // 8 bytes for it.
2913 if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2914 ((!IsWindows && RPI.Reg2 == AArch64::FP) ||
2915 (IsWindows && RPI.Reg2 == AArch64::LR)))
2916 ByteOffset += StackFillDir * 8;
2917
2918 assert(!(RPI.isScalable() && RPI.isPaired()) &&
2919 "Paired spill/fill instructions don't exist for SVE vectors");
2920
2921 // Round up size of non-pair to pair size if we need to pad the
2922 // callee-save area to ensure 16-byte alignment.
2923 if (NeedGapToAlignStack && !NeedsWinCFI &&
2924 !RPI.isScalable() && RPI.Type != RegPairInfo::FPR128 &&
2925 !RPI.isPaired() && ByteOffset % 16 != 0) {
2926 ByteOffset += 8 * StackFillDir;
2927 assert(MFI.getObjectAlign(RPI.FrameIdx) <= Align(16));
2928 // A stack frame with a gap looks like this, bottom up:
2929 // d9, d8. x21, gap, x20, x19.
2930 // Set extra alignment on the x21 object to create the gap above it.
2931 MFI.setObjectAlignment(RPI.FrameIdx, Align(16));
2932 NeedGapToAlignStack = false;
2933 }
2934
2935 int OffsetPost = RPI.isScalable() ? ScalableByteOffset : ByteOffset;
2936 assert(OffsetPost % Scale == 0);
2937 // If filling top down (default), we want the offset after incrementing it.
2938 // If filling bottom up (WinCFI) we need the original offset.
2939 int Offset = NeedsWinCFI ? OffsetPre : OffsetPost;
2940
2941 // The FP, LR pair goes 8 bytes into our expanded 24-byte slot so that the
2942 // Swift context can directly precede FP.
2943 if (NeedsFrameRecord && AFI->hasSwiftAsyncContext() &&
2944 ((!IsWindows && RPI.Reg2 == AArch64::FP) ||
2945 (IsWindows && RPI.Reg2 == AArch64::LR)))
2946 Offset += 8;
2947 RPI.Offset = Offset / Scale;
2948
2949 assert(((!RPI.isScalable() && RPI.Offset >= -64 && RPI.Offset <= 63) ||
2950 (RPI.isScalable() && RPI.Offset >= -256 && RPI.Offset <= 255)) &&
2951 "Offset out of bounds for LDP/STP immediate");
2952
2953 // Save the offset to frame record so that the FP register can point to the
2954 // innermost frame record (spilled FP and LR registers).
2955 if (NeedsFrameRecord && ((!IsWindows && RPI.Reg1 == AArch64::LR &&
2956 RPI.Reg2 == AArch64::FP) ||
2957 (IsWindows && RPI.Reg1 == AArch64::FP &&
2958 RPI.Reg2 == AArch64::LR)))
2959 AFI->setCalleeSaveBaseToFrameRecordOffset(Offset);
2960
2961 RegPairs.push_back(RPI);
2962 if (RPI.isPaired())
2963 i += RegInc;
2964 }
2965 if (NeedsWinCFI) {
2966 // If we need an alignment gap in the stack, align the topmost stack
2967 // object. A stack frame with a gap looks like this, bottom up:
2968 // x19, d8. d9, gap.
2969 // Set extra alignment on the topmost stack object (the first element in
2970 // CSI, which goes top down), to create the gap above it.
2971 if (AFI->hasCalleeSaveStackFreeSpace())
2972 MFI.setObjectAlignment(CSI[0].getFrameIdx(), Align(16));
2973 // We iterated bottom up over the registers; flip RegPairs back to top
2974 // down order.
2975 std::reverse(RegPairs.begin(), RegPairs.end());
2976 }
2977}
2978
2979 bool AArch64FrameLowering::spillCalleeSavedRegisters(
2980 MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
2981 ArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
2982 MachineFunction &MF = *MBB.getParent();
2983 const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
2984 bool NeedsWinCFI = needsWinCFI(MF);
2985 DebugLoc DL;
2986 SmallVector<RegPairInfo, 8> RegPairs;
2987
2988 computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
2989
2990 const MachineRegisterInfo &MRI = MF.getRegInfo();
2991 if (homogeneousPrologEpilog(MF)) {
2992 auto MIB = BuildMI(MBB, MI, DL, TII.get(AArch64::HOM_Prolog))
2993 .setMIFlag(MachineInstr::FrameSetup);
2994
2995 for (auto &RPI : RegPairs) {
2996 MIB.addReg(RPI.Reg1);
2997 MIB.addReg(RPI.Reg2);
2998
2999 // Update register live in.
3000 if (!MRI.isReserved(RPI.Reg1))
3001 MBB.addLiveIn(RPI.Reg1);
3002 if (RPI.isPaired() && !MRI.isReserved(RPI.Reg2))
3003 MBB.addLiveIn(RPI.Reg2);
3004 }
3005 return true;
3006 }
3007 for (const RegPairInfo &RPI : llvm::reverse(RegPairs)) {
3008 unsigned Reg1 = RPI.Reg1;
3009 unsigned Reg2 = RPI.Reg2;
3010 unsigned StrOpc;
3011
3012 // Issue sequence of spills for cs regs. The first spill may be converted
3013 // to a pre-decrement store later by emitPrologue if the callee-save stack
3014 // area allocation can't be combined with the local stack area allocation.
3015 // For example:
3016 // stp x22, x21, [sp, #0] // addImm(+0)
3017 // stp x20, x19, [sp, #16] // addImm(+2)
3018 // stp fp, lr, [sp, #32] // addImm(+4)
3019 // Rationale: This sequence saves uop updates compared to a sequence of
3020 // pre-increment spills like stp xi,xj,[sp,#-16]!
3021 // Note: Similar rationale and sequence for restores in epilog.
3022 unsigned Size;
3023 Align Alignment;
3024 switch (RPI.Type) {
3025 case RegPairInfo::GPR:
3026 StrOpc = RPI.isPaired() ? AArch64::STPXi : AArch64::STRXui;
3027 Size = 8;
3028 Alignment = Align(8);
3029 break;
3030 case RegPairInfo::FPR64:
3031 StrOpc = RPI.isPaired() ? AArch64::STPDi : AArch64::STRDui;
3032 Size = 8;
3033 Alignment = Align(8);
3034 break;
3035 case RegPairInfo::FPR128:
3036 StrOpc = RPI.isPaired() ? AArch64::STPQi : AArch64::STRQui;
3037 Size = 16;
3038 Alignment = Align(16);
3039 break;
3040 case RegPairInfo::ZPR:
3041 StrOpc = AArch64::STR_ZXI;
3042 Size = 16;
3043 Alignment = Align(16);
3044 break;
3045 case RegPairInfo::PPR:
3046 StrOpc = AArch64::STR_PXI;
3047 Size = 2;
3048 Alignment = Align(2);
3049 break;
3050 }
3051 LLVM_DEBUG(dbgs() << "CSR spill: (" << printReg(Reg1, TRI);
3052 if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
3053 dbgs() << ") -> fi#(" << RPI.FrameIdx;
3054 if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
3055 dbgs() << ")\n");
3056
3057 assert((!NeedsWinCFI || !(Reg1 == AArch64::LR && Reg2 == AArch64::FP)) &&
3058 "Windows unwdinding requires a consecutive (FP,LR) pair");
3059 // Windows unwind codes require consecutive registers if registers are
3060 // paired. Make the switch here, so that the code below will save (x,x+1)
3061 // and not (x+1,x).
3062 unsigned FrameIdxReg1 = RPI.FrameIdx;
3063 unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
3064 if (NeedsWinCFI && RPI.isPaired()) {
3065 std::swap(Reg1, Reg2);
3066 std::swap(FrameIdxReg1, FrameIdxReg2);
3067 }
3068 MachineInstrBuilder MIB = BuildMI(MBB, MI, DL, TII.get(StrOpc));
3069 if (!MRI.isReserved(Reg1))
3070 MBB.addLiveIn(Reg1);
3071 if (RPI.isPaired()) {
3072 if (!MRI.isReserved(Reg2))
3073 MBB.addLiveIn(Reg2);
3074 MIB.addReg(Reg2, getPrologueDeath(MF, Reg2));
3075 MIB.addMemOperand(MF.getMachineMemOperand(
3076 MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
3077 MachineMemOperand::MOStore, Size, Alignment));
3078 }
3079 MIB.addReg(Reg1, getPrologueDeath(MF, Reg1))
3080 .addReg(AArch64::SP)
3081 .addImm(RPI.Offset) // [sp, #offset*scale],
3082 // where factor*scale is implicit
3083 .setMIFlag(MachineInstr::FrameSetup);
3084 MIB.addMemOperand(MF.getMachineMemOperand(
3085 MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
3086 MachineMemOperand::MOStore, Size, Alignment));
3087 if (NeedsWinCFI)
3088 InsertSEH(MIB, TII, MachineInstr::FrameSetup);
3089
3090 // Update the StackIDs of the SVE stack slots.
3091 MachineFrameInfo &MFI = MF.getFrameInfo();
3092 if (RPI.Type == RegPairInfo::ZPR || RPI.Type == RegPairInfo::PPR)
3093 MFI.setStackID(RPI.FrameIdx, TargetStackID::ScalableVector);
3094
3095 }
3096 return true;
3097}
3098
3099 bool AArch64FrameLowering::restoreCalleeSavedRegisters(
3100 MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
3101 MutableArrayRef<CalleeSavedInfo> CSI, const TargetRegisterInfo *TRI) const {
3102 MachineFunction &MF = *MBB.getParent();
3103 const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
3104 DebugLoc DL;
3105 SmallVector<RegPairInfo, 8> RegPairs;
3106 bool NeedsWinCFI = needsWinCFI(MF);
3107
3108 if (MBBI != MBB.end())
3109 DL = MBBI->getDebugLoc();
3110
3111 computeCalleeSaveRegisterPairs(MF, CSI, TRI, RegPairs, hasFP(MF));
3112
3113 if (homogeneousPrologEpilog(MF, &MBB)) {
3114 auto MIB = BuildMI(MBB, MBBI, DL, TII.get(AArch64::HOM_Epilog))
3115 .setMIFlag(MachineInstr::FrameDestroy);
3116 for (auto &RPI : RegPairs) {
3117 MIB.addReg(RPI.Reg1, RegState::Define);
3118 MIB.addReg(RPI.Reg2, RegState::Define);
3119 }
3120 return true;
3121 }
3122
3123 // For performance reasons, restore SVE registers in increasing order.
3124 auto IsPPR = [](const RegPairInfo &c) { return c.Type == RegPairInfo::PPR; };
3125 auto PPRBegin = std::find_if(RegPairs.begin(), RegPairs.end(), IsPPR);
3126 auto PPREnd = std::find_if_not(PPRBegin, RegPairs.end(), IsPPR);
3127 std::reverse(PPRBegin, PPREnd);
3128 auto IsZPR = [](const RegPairInfo &c) { return c.Type == RegPairInfo::ZPR; };
3129 auto ZPRBegin = std::find_if(RegPairs.begin(), RegPairs.end(), IsZPR);
3130 auto ZPREnd = std::find_if_not(ZPRBegin, RegPairs.end(), IsZPR);
3131 std::reverse(ZPRBegin, ZPREnd);
3132
3133 for (const RegPairInfo &RPI : RegPairs) {
3134 unsigned Reg1 = RPI.Reg1;
3135 unsigned Reg2 = RPI.Reg2;
3136
3137 // Issue sequence of restores for cs regs. The last restore may be converted
3138 // to a post-increment load later by emitEpilogue if the callee-save stack
3139 // area allocation can't be combined with the local stack area allocation.
3140 // For example:
3141 // ldp fp, lr, [sp, #32] // addImm(+4)
3142 // ldp x20, x19, [sp, #16] // addImm(+2)
3143 // ldp x22, x21, [sp, #0] // addImm(+0)
3144 // Note: see comment in spillCalleeSavedRegisters()
3145 unsigned LdrOpc;
3146 unsigned Size;
3147 Align Alignment;
3148 switch (RPI.Type) {
3149 case RegPairInfo::GPR:
3150 LdrOpc = RPI.isPaired() ? AArch64::LDPXi : AArch64::LDRXui;
3151 Size = 8;
3152 Alignment = Align(8);
3153 break;
3154 case RegPairInfo::FPR64:
3155 LdrOpc = RPI.isPaired() ? AArch64::LDPDi : AArch64::LDRDui;
3156 Size = 8;
3157 Alignment = Align(8);
3158 break;
3159 case RegPairInfo::FPR128:
3160 LdrOpc = RPI.isPaired() ? AArch64::LDPQi : AArch64::LDRQui;
3161 Size = 16;
3162 Alignment = Align(16);
3163 break;
3164 case RegPairInfo::ZPR:
3165 LdrOpc = AArch64::LDR_ZXI;
3166 Size = 16;
3167 Alignment = Align(16);
3168 break;
3169 case RegPairInfo::PPR:
3170 LdrOpc = AArch64::LDR_PXI;
3171 Size = 2;
3172 Alignment = Align(2);
3173 break;
3174 }
3175 LLVM_DEBUG(dbgs() << "CSR restore: (" << printReg(Reg1, TRI);
3176 if (RPI.isPaired()) dbgs() << ", " << printReg(Reg2, TRI);
3177 dbgs() << ") -> fi#(" << RPI.FrameIdx;
3178 if (RPI.isPaired()) dbgs() << ", " << RPI.FrameIdx + 1;
3179 dbgs() << ")\n");
3180
3181 // Windows unwind codes require consecutive registers if registers are
3182 // paired. Make the switch here, so that the code below will save (x,x+1)
3183 // and not (x+1,x).
3184 unsigned FrameIdxReg1 = RPI.FrameIdx;
3185 unsigned FrameIdxReg2 = RPI.FrameIdx + 1;
3186 if (NeedsWinCFI && RPI.isPaired()) {
3187 std::swap(Reg1, Reg2);
3188 std::swap(FrameIdxReg1, FrameIdxReg2);
3189 }
3190 MachineInstrBuilder MIB = BuildMI(MBB, MBBI, DL, TII.get(LdrOpc));
3191 if (RPI.isPaired()) {
3192 MIB.addReg(Reg2, getDefRegState(true));
3193 MIB.addMemOperand(MF.getMachineMemOperand(
3194 MachinePointerInfo::getFixedStack(MF, FrameIdxReg2),
3195 MachineMemOperand::MOLoad, Size, Alignment));
3196 }
3197 MIB.addReg(Reg1, getDefRegState(true))
3198 .addReg(AArch64::SP)
3199 .addImm(RPI.Offset) // [sp, #offset*scale]
3200 // where factor*scale is implicit
3201 .setMIFlag(MachineInstr::FrameDestroy);
3202 MIB.addMemOperand(MF.getMachineMemOperand(
3203 MachinePointerInfo::getFixedStack(MF, FrameIdxReg1),
3204 MachineMemOperand::MOLoad, Size, Alignment));
3205 if (NeedsWinCFI)
3206 InsertSEH(MIB, TII, MachineInstr::FrameDestroy);
3207 }
3208
3209 return true;
3210}
3211
3212 void AArch64FrameLowering::determineCalleeSaves(MachineFunction &MF,
3213 BitVector &SavedRegs,
3214 RegScavenger *RS) const {
3215 // All calls are tail calls in GHC calling conv, and functions have no
3216 // prologue/epilogue.
3217 if (MF.getFunction().getCallingConv() == CallingConv::GHC)
3218 return;
3219
3220 TargetFrameLowering::determineCalleeSaves(MF, SavedRegs, RS);
3221 const AArch64RegisterInfo *RegInfo = static_cast<const AArch64RegisterInfo *>(
3222 MF.getSubtarget().getRegisterInfo());
3223 const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
3224 AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
3225 unsigned UnspilledCSGPR = AArch64::NoRegister;
3226 unsigned UnspilledCSGPRPaired = AArch64::NoRegister;
3227
3228 MachineFrameInfo &MFI = MF.getFrameInfo();
3229 const MCPhysReg *CSRegs = MF.getRegInfo().getCalleeSavedRegs();
3230
3231 unsigned BasePointerReg = RegInfo->hasBasePointer(MF)
3232 ? RegInfo->getBaseRegister()
3233 : (unsigned)AArch64::NoRegister;
3234
3235 unsigned ExtraCSSpill = 0;
3236 bool HasUnpairedGPR64 = false;
3237 // Figure out which callee-saved registers to save/restore.
3238 for (unsigned i = 0; CSRegs[i]; ++i) {
3239 const unsigned Reg = CSRegs[i];
3240
3241 // Add the base pointer register to SavedRegs if it is callee-save.
3242 if (Reg == BasePointerReg)
3243 SavedRegs.set(Reg);
3244
3245 bool RegUsed = SavedRegs.test(Reg);
3246 unsigned PairedReg = AArch64::NoRegister;
3247 const bool RegIsGPR64 = AArch64::GPR64RegClass.contains(Reg);
3248 if (RegIsGPR64 || AArch64::FPR64RegClass.contains(Reg) ||
3249 AArch64::FPR128RegClass.contains(Reg)) {
3250 // Compensate for odd numbers of GP CSRs.
3251 // For now, all the known cases of odd number of CSRs are of GPRs.
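// CSRegs comes pair-aligned, so "i ^ 1" simply flips the low bit of the
// index to pick the other element of an even/odd pair.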
3252 if (HasUnpairedGPR64)
3253 PairedReg = CSRegs[i % 2 == 0 ? i - 1 : i + 1];
3254 else
3255 PairedReg = CSRegs[i ^ 1];
3256 }
3257
3258 // If the function requires all the GP registers to save (SavedRegs),
3259 // and there are an odd number of GP CSRs at the same time (CSRegs),
3260 // PairedReg could be in a different register class from Reg, which would
3261 // lead to a FPR (usually D8) accidentally being marked saved.
3262 if (RegIsGPR64 && !AArch64::GPR64RegClass.contains(PairedReg)) {
3263 PairedReg = AArch64::NoRegister;
3264 HasUnpairedGPR64 = true;
3265 }
3266 assert(PairedReg == AArch64::NoRegister ||
3267 AArch64::GPR64RegClass.contains(Reg, PairedReg) ||
3268 AArch64::FPR64RegClass.contains(Reg, PairedReg) ||
3269 AArch64::FPR128RegClass.contains(Reg, PairedReg));
3270
3271 if (!RegUsed) {
3272 if (AArch64::GPR64RegClass.contains(Reg) &&
3273 !RegInfo->isReservedReg(MF, Reg)) {
3274 UnspilledCSGPR = Reg;
3275 UnspilledCSGPRPaired = PairedReg;
3276 }
3277 continue;
3278 }
3279
3280 // MachO's compact unwind format relies on all registers being stored in
3281 // pairs.
3282 // FIXME: the usual format is actually better if unwinding isn't needed.
3283 if (producePairRegisters(MF) && PairedReg != AArch64::NoRegister &&
3284 !SavedRegs.test(PairedReg)) {
3285 SavedRegs.set(PairedReg);
3286 if (AArch64::GPR64RegClass.contains(PairedReg) &&
3287 !RegInfo->isReservedReg(MF, PairedReg))
3288 ExtraCSSpill = PairedReg;
3289 }
3290 }
3291
3292 if (MF.getFunction().getCallingConv() == CallingConv::Win64 &&
3291
3293 !Subtarget.isTargetWindows()) {
3294 // For Windows calling convention on a non-windows OS, where X18 is treated
3295 // as reserved, back up X18 when entering non-windows code (marked with the
3296 // Windows calling convention) and restore when returning regardless of
3297 // whether the individual function uses it - it might call other functions
3298 // that clobber it.
3299 SavedRegs.set(AArch64::X18);
3300 }
3301
3302 // Calculates the callee saved stack size.
3303 unsigned CSStackSize = 0;
3304 unsigned SVECSStackSize = 0;
3305 const TargetRegisterInfo *TRI = Subtarget.getRegisterInfo();
3306 const MachineRegisterInfo &MRI = MF.getRegInfo();
3307 for (unsigned Reg : SavedRegs.set_bits()) {
3308 auto RegSize = TRI->getRegSizeInBits(Reg, MRI) / 8;
3309 if (AArch64::PPRRegClass.contains(Reg) ||
3310 AArch64::ZPRRegClass.contains(Reg))
3311 SVECSStackSize += RegSize;
3312 else
3313 CSStackSize += RegSize;
3314 }
3315
3316 // Save number of saved regs, so we can easily update CSStackSize later.
3317 unsigned NumSavedRegs = SavedRegs.count();
3318
3319 // The frame record needs to be created by saving the appropriate registers
3320 uint64_t EstimatedStackSize = MFI.estimateStackSize(MF);
3321 if (hasFP(MF) ||
3322 windowsRequiresStackProbe(MF, EstimatedStackSize + CSStackSize + 16)) {
3323 SavedRegs.set(AArch64::FP);
3324 SavedRegs.set(AArch64::LR);
3325 }
3326
3327 LLVM_DEBUG(dbgs() << "*** determineCalleeSaves\nSaved CSRs:";
3328 for (unsigned Reg
3329 : SavedRegs.set_bits()) dbgs()
3330 << ' ' << printReg(Reg, RegInfo);
3331 dbgs() << "\n";);
3332
3333 // If any callee-saved registers are used, the frame cannot be eliminated.
3334 int64_t SVEStackSize =
3335 alignTo(SVECSStackSize + estimateSVEStackObjectOffsets(MFI), 16);
3336 bool CanEliminateFrame = (SavedRegs.count() == 0) && !SVEStackSize;
3337
3338 // The CSR spill slots have not been allocated yet, so estimateStackSize
3339 // won't include them.
3340 unsigned EstimatedStackSizeLimit = estimateRSStackSizeLimit(MF);
3341
3342 // We may address some of the stack above the canonical frame address, either
3343 // for our own arguments or during a call. Include that in calculating whether
3344 // we have complicated addressing concerns.
3345 int64_t CalleeStackUsed = 0;
3346 for (int I = MFI.getObjectIndexBegin(); I != 0; ++I) {
3347 int64_t FixedOff = MFI.getObjectOffset(I);
3348 if (FixedOff > CalleeStackUsed) CalleeStackUsed = FixedOff;
3349 }
3350
3351 // Conservatively always assume BigStack when there are SVE spills.
3352 bool BigStack = SVEStackSize || (EstimatedStackSize + CSStackSize +
3353 CalleeStackUsed) > EstimatedStackSizeLimit;
3354 if (BigStack || !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF))
3355 AFI->setHasStackFrame(true);
3356
3357 // Estimate if we might need to scavenge a register at some point in order
3358 // to materialize a stack offset. If so, either spill one additional
3359 // callee-saved register or reserve a special spill slot to facilitate
3360 // register scavenging. If we already spilled an extra callee-saved register
3361 // above to keep the number of spills even, we don't need to do anything else
3362 // here.
3363 if (BigStack) {
3364 if (!ExtraCSSpill && UnspilledCSGPR != AArch64::NoRegister) {
3365 LLVM_DEBUG(dbgs() << "Spilling " << printReg(UnspilledCSGPR, RegInfo)
3366 << " to get a scratch register.\n");
3367 SavedRegs.set(UnspilledCSGPR);
3368 ExtraCSSpill = UnspilledCSGPR;
3369
3370 // MachO's compact unwind format relies on all registers being stored in
3371 // pairs, so if we need to spill one extra for BigStack, then we need to
3372 // store the pair.
3373 if (producePairRegisters(MF)) {
3374 if (UnspilledCSGPRPaired == AArch64::NoRegister) {
3375 // Failed to make a pair for compact unwind format, revert spilling.
3376 if (produceCompactUnwindFrame(MF)) {
3377 SavedRegs.reset(UnspilledCSGPR);
3378 ExtraCSSpill = AArch64::NoRegister;
3379 }
3380 } else
3381 SavedRegs.set(UnspilledCSGPRPaired);
3382 }
3383 }
3384
3385 // If we didn't find an extra callee-saved register to spill, create
3386 // an emergency spill slot.
3387 if (!ExtraCSSpill || MF.getRegInfo().isPhysRegUsed(ExtraCSSpill)) {
3388 const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
3389 const TargetRegisterClass &RC = AArch64::GPR64RegClass;
3390 unsigned Size = TRI->getSpillSize(RC);
3391 Align Alignment = TRI->getSpillAlign(RC);
3392 int FI = MFI.CreateStackObject(Size, Alignment, false);
3393 RS->addScavengingFrameIndex(FI);
3394 LLVM_DEBUG(dbgs() << "No available CS registers, allocated fi#" << FI
3395 << " as the emergency spill slot.\n");
3396 }
3397 }
3398
3399 // Add the size of the additional 64-bit GPR saves.
3400 CSStackSize += 8 * (SavedRegs.count() - NumSavedRegs);
3401
3402 // A Swift asynchronous context extends the frame record with a pointer
3403 // directly before FP.
3404 if (hasFP(MF) && AFI->hasSwiftAsyncContext())
3405 CSStackSize += 8;
3406
3407 uint64_t AlignedCSStackSize = alignTo(CSStackSize, 16);
3408 LLVM_DEBUG(dbgs() << "Estimated stack frame size: "
3409 << EstimatedStackSize + AlignedCSStackSize
3410 << " bytes.\n");
3411
3412 assert((!MFI.isCalleeSavedInfoValid() ||
3413 AFI->getCalleeSavedStackSize() == AlignedCSStackSize) &&
3414 "Should not invalidate callee saved info");
3415
3416 // Round up to register pair alignment to avoid additional SP adjustment
3417 // instructions.
3418 AFI->setCalleeSavedStackSize(AlignedCSStackSize);
3419 AFI->setCalleeSaveStackHasFreeSpace(AlignedCSStackSize != CSStackSize);
3420 AFI->setSVECalleeSavedStackSize(alignTo(SVECSStackSize, 16));
3421}
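// --- Illustrative sketch (editor's addition, not part of the upstream
// source): the pairing logic above exploits the layout of the callee-saved
// register list, where the two halves of a potential stp/ldp pair sit at
// adjacent even/odd indices, so "CSRegs[i ^ 1]" names the partner of
// "CSRegs[i]". The hypothetical helper below only demonstrates that index
// trick plus the final 16-byte rounding of the callee-save area.
[[maybe_unused]] static void calleeSavePairingSketch() {
  // Index i pairs with i ^ 1: 0<->1, 2<->3, ... (same block of two).
  for (unsigned i = 0; i < 8; ++i)
    assert(((i ^ 1) / 2 == i / 2) && (i ^ 1) != i);
  // Three 8-byte GPR saves round up to 32 bytes, leaving 8 bytes of free
  // space, which is what setCalleeSaveStackHasFreeSpace() records above.
  assert(alignTo(3 * 8, 16) == 32);
}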
3422
3423 bool AArch64FrameLowering::assignCalleeSavedSpillSlots(
3424 MachineFunction &MF, const TargetRegisterInfo *RegInfo,
3425 std::vector<CalleeSavedInfo> &CSI, unsigned &MinCSFrameIndex,
3426 unsigned &MaxCSFrameIndex) const {
3427 bool NeedsWinCFI = needsWinCFI(MF);
3428 // To match the canonical windows frame layout, reverse the list of
3429 // callee saved registers to get them laid out by PrologEpilogInserter
3430 // in the right order. (PrologEpilogInserter allocates stack objects top
3431 // down. Windows canonical prologs store higher numbered registers at
3432 // the top, thus have the CSI array start from the highest registers.)
3433 if (NeedsWinCFI)
3434 std::reverse(CSI.begin(), CSI.end());
3435
3436 if (CSI.empty())
3437 return true; // Early exit if no callee saved registers are modified!
3438
3439 // Now that we know which registers need to be saved and restored, allocate
3440 // stack slots for them.
3441 MachineFrameInfo &MFI = MF.getFrameInfo();
3442 auto *AFI = MF.getInfo<AArch64FunctionInfo>();
3443
3444 bool UsesWinAAPCS = isTargetWindows(MF);
3445 if (UsesWinAAPCS && hasFP(MF) && AFI->hasSwiftAsyncContext()) {
3446 int FrameIdx = MFI.CreateStackObject(8, Align(16), true);
3447 AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3448 if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3449 if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3450 }
3451
3452 for (auto &CS : CSI) {
3453 Register Reg = CS.getReg();
3454 const TargetRegisterClass *RC = RegInfo->getMinimalPhysRegClass(Reg);
3455
3456 unsigned Size = RegInfo->getSpillSize(*RC);
3457 Align Alignment(RegInfo->getSpillAlign(*RC));
3458 int FrameIdx = MFI.CreateStackObject(Size, Alignment, true);
3459 CS.setFrameIdx(FrameIdx);
3460
3461 if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3462 if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3463
3464 // Grab 8 bytes below FP for the extended asynchronous frame info.
3465 if (hasFP(MF) && AFI->hasSwiftAsyncContext() && !UsesWinAAPCS &&
3466 Reg == AArch64::FP) {
3467 FrameIdx = MFI.CreateStackObject(8, Alignment, true);
3468 AFI->setSwiftAsyncContextFrameIdx(FrameIdx);
3469 if ((unsigned)FrameIdx < MinCSFrameIndex) MinCSFrameIndex = FrameIdx;
3470 if ((unsigned)FrameIdx > MaxCSFrameIndex) MaxCSFrameIndex = FrameIdx;
3471 }
3472 }
3473 return true;
3474}
3475
3476 bool AArch64FrameLowering::enableStackSlotScavenging(
3477 const MachineFunction &MF) const {
3478 const AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
3479 // If the function has streaming-mode changes, don't scavenge a
3480 // spillslot in the callee-save area, as that might require an
3481 // 'addvl' in the streaming-mode-changing call-sequence when the
3482 // function doesn't use a FP.
3483 if (AFI->hasStreamingModeChanges() && !hasFP(MF))
3484 return false;
3485 return AFI->hasCalleeSaveStackFreeSpace();
3486}
3487
3488/// Returns true if there are any SVE callee saves.
3489 static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI,
3490 int &Min, int &Max) {
3491 Min = std::numeric_limits<int>::max();
3492 Max = std::numeric_limits<int>::min();
3493
3494 if (!MFI.isCalleeSavedInfoValid())
3495 return false;
3496
3497 const std::vector<CalleeSavedInfo> &CSI = MFI.getCalleeSavedInfo();
3498 for (auto &CS : CSI) {
3499 if (AArch64::ZPRRegClass.contains(CS.getReg()) ||
3500 AArch64::PPRRegClass.contains(CS.getReg())) {
3501 assert((Max == std::numeric_limits<int>::min() ||
3502 Max + 1 == CS.getFrameIdx()) &&
3503 "SVE CalleeSaves are not consecutive");
3504
3505 Min = std::min(Min, CS.getFrameIdx());
3506 Max = std::max(Max, CS.getFrameIdx());
3507 }
3508 }
3509 return Min != std::numeric_limits<int>::max();
3510}
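// --- Illustrative sketch (editor's addition, not part of the upstream
// source): the scan above relies on PrologEpilogInserter having allocated
// the SVE callee-save slots at consecutive frame indices. A standalone
// version of the same range computation over a hypothetical index list:
[[maybe_unused]] static bool sveSlotRangeSketch(ArrayRef<int> FrameIndices,
                                                int &Min, int &Max) {
  Min = std::numeric_limits<int>::max();
  Max = std::numeric_limits<int>::min();
  for (int FI : FrameIndices) {
    assert((Max == std::numeric_limits<int>::min() || Max + 1 == FI) &&
           "SVE CalleeSaves are not consecutive");
    Min = std::min(Min, FI);
    Max = std::max(Max, FI);
  }
  return Min != std::numeric_limits<int>::max();
}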
3511
3512// Process all the SVE stack objects and determine offsets for each
3513// object. If AssignOffsets is true, the offsets get assigned.
3514// Fills in the first and last callee-saved frame indices into
3515// Min/MaxCSFrameIndex, respectively.
3516// Returns the size of the stack.
3517 static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI,
3518 int &MinCSFrameIndex,
3519 int &MaxCSFrameIndex,
3520 bool AssignOffsets) {
3521#ifndef NDEBUG
3522 // First process all fixed stack objects.
3523 for (int I = MFI.getObjectIndexBegin(); I != 0; ++I)
3525 "SVE vectors should never be passed on the stack by value, only by "
3526 "reference.");
3527#endif
3528
3529 auto Assign = [&MFI](int FI, int64_t Offset) {
3530 LLVM_DEBUG(dbgs() << "alloc FI(" << FI << ") at SP[" << Offset << "]\n");
3531 MFI.setObjectOffset(FI, Offset);
3532 };
3533
3534 int64_t Offset = 0;
3535
3536 // Then process all callee saved slots.
3537 if (getSVECalleeSaveSlotRange(MFI, MinCSFrameIndex, MaxCSFrameIndex)) {
3538 // Assign offsets to the callee save slots.
3539 for (int I = MinCSFrameIndex; I <= MaxCSFrameIndex; ++I) {
3540 Offset += MFI.getObjectSize(I);
3541 Offset = alignTo(Offset, MFI.getObjectAlign(I));
3542 if (AssignOffsets)
3543 Assign(I, -Offset);
3544 }
3545 }
3546
3547 // Ensure that the callee-save area is aligned to 16 bytes.
3548 Offset = alignTo(Offset, Align(16U));
3549
3550 // Create a buffer of SVE objects to allocate and sort it.
3551 SmallVector<int, 8> ObjectsToAllocate;
3552 // If we have a stack protector, and we've previously decided that we have SVE
3553 // objects on the stack and thus need it to go in the SVE stack area, then it
3554 // needs to go first.
3555 int StackProtectorFI = -1;
3556 if (MFI.hasStackProtectorIndex()) {
3557 StackProtectorFI = MFI.getStackProtectorIndex();
3558 if (MFI.getStackID(StackProtectorFI) == TargetStackID::ScalableVector)
3559 ObjectsToAllocate.push_back(StackProtectorFI);
3560 }
3561 for (int I = 0, E = MFI.getObjectIndexEnd(); I != E; ++I) {
3562 unsigned StackID = MFI.getStackID(I);
3563 if (StackID != TargetStackID::ScalableVector)
3564 continue;
3565 if (I == StackProtectorFI)
3566 continue;
3567 if (MaxCSFrameIndex >= I && I >= MinCSFrameIndex)
3568 continue;
3569 if (MFI.isDeadObjectIndex(I))
3570 continue;
3571
3572 ObjectsToAllocate.push_back(I);
3573 }
3574
3575 // Allocate all SVE locals and spills
3576 for (unsigned FI : ObjectsToAllocate) {
3577 Align Alignment = MFI.getObjectAlign(FI);
3578 // FIXME: Given that the length of SVE vectors is not necessarily a power of
3579 // two, we'd need to align every object dynamically at runtime if the
3580 // alignment is larger than 16. This is not yet supported.
3581 if (Alignment > Align(16))
3583 "Alignment of scalable vectors > 16 bytes is not yet supported");
3584
3585 Offset = alignTo(Offset + MFI.getObjectSize(FI), Alignment);
3586 if (AssignOffsets)
3587 Assign(FI, -Offset);
3588 }
3589
3590 return Offset;
3591}
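// --- Illustrative sketch (editor's addition, not part of the upstream
// source): offsets grow downwards from the top of the SVE area, so each
// object is assigned the negated running total after alignment. With two
// hypothetical 16-byte-aligned objects of 16 and 32 scalable bytes:
[[maybe_unused]] static void sveOffsetAssignmentSketch() {
  int64_t Offset = 0;
  Offset = alignTo(Offset + 16, Align(16)); // first object  -> SP[-16]
  assert(Offset == 16);
  Offset = alignTo(Offset + 32, Align(16)); // second object -> SP[-48]
  assert(Offset == 48); // returned as the total size of the SVE area
}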
3592
3593int64_t AArch64FrameLowering::estimateSVEStackObjectOffsets(
3594 MachineFrameInfo &MFI) const {
3595 int MinCSFrameIndex, MaxCSFrameIndex;
3596 return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex, false);
3597}
3598
3599int64_t AArch64FrameLowering::assignSVEStackObjectOffsets(
3600 MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex) const {
3601 return determineSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex,
3602 true);
3603}
3604
3605 void AArch64FrameLowering::processFunctionBeforeFrameFinalized(
3606 MachineFunction &MF, RegScavenger *RS) const {
3607 MachineFrameInfo &MFI = MF.getFrameInfo();
3608
3610 "Upwards growing stack unsupported");
3611
3612 int MinCSFrameIndex, MaxCSFrameIndex;
3613 int64_t SVEStackSize =
3614 assignSVEStackObjectOffsets(MFI, MinCSFrameIndex, MaxCSFrameIndex);
3615
3616 AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
3617 AFI->setStackSizeSVE(alignTo(SVEStackSize, 16U));
3618 AFI->setMinMaxSVECSFrameIndex(MinCSFrameIndex, MaxCSFrameIndex);
3619
3620 // If this function isn't doing Win64-style C++ EH, we don't need to do
3621 // anything.
3622 if (!MF.hasEHFunclets())
3623 return;
3624 const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
3625 WinEHFuncInfo &EHInfo = *MF.getWinEHFuncInfo();
3626
3627 MachineBasicBlock &MBB = MF.front();
3628 auto MBBI = MBB.begin();
3629 while (MBBI != MBB.end() && MBBI->getFlag(MachineInstr::FrameSetup))
3630 ++MBBI;
3631
3632 // Create an UnwindHelp object.
3633 // The UnwindHelp object is allocated at the start of the fixed object area
3634 int64_t FixedObject =
3635 getFixedObjectSize(MF, AFI, /*IsWin64*/ true, /*IsFunclet*/ false);
3636 int UnwindHelpFI = MFI.CreateFixedObject(/*Size*/ 8,
3637 /*SPOffset*/ -FixedObject,
3638 /*IsImmutable=*/false);
3639 EHInfo.UnwindHelpFrameIdx = UnwindHelpFI;
3640
3641 // We need to store -2 into the UnwindHelp object at the start of the
3642 // function.
3643 DebugLoc DL;
3644 RS->enterBasicBlockEnd(MBB);
3645 RS->backward(MBBI);
3646 Register DstReg = RS->FindUnusedReg(&AArch64::GPR64commonRegClass);
3647 assert(DstReg && "There must be a free register after frame setup");
3648 BuildMI(MBB, MBBI, DL, TII.get(AArch64::MOVi64imm), DstReg).addImm(-2);
3649 BuildMI(MBB, MBBI, DL, TII.get(AArch64::STURXi))
3650 .addReg(DstReg, getKillRegState(true))
3651 .addFrameIndex(UnwindHelpFI)
3652 .addImm(0);
3653}
3654
3655namespace {
3656struct TagStoreInstr {
3657 MachineInstr *MI;
3658 int64_t Offset, Size;
3659 explicit TagStoreInstr(MachineInstr *MI, int64_t Offset, int64_t Size)
3660 : MI(MI), Offset(Offset), Size(Size) {}
3661};
3662
3663class TagStoreEdit {
3664 MachineFunction *MF;
3665 MachineBasicBlock *MBB;
3666 MachineRegisterInfo *MRI;
3667 // Tag store instructions that are being replaced.
3668 SmallVector<TagStoreInstr, 8> TagStores;
3669 // Combined memref arguments of the above instructions.
3670 SmallVector<MachineMemOperand *, 8> CombinedMemRefs;
3671
3672 // Replace allocation tags in [FrameReg + FrameRegOffset, FrameReg +
3673 // FrameRegOffset + Size) with the address tag of SP.
3674 Register FrameReg;
3675 StackOffset FrameRegOffset;
3676 int64_t Size;
3677 // If not std::nullopt, move FrameReg to (FrameReg + FrameRegUpdate) at the
3678 // end.
3679 std::optional<int64_t> FrameRegUpdate;
3680 // MIFlags for any FrameReg updating instructions.
3681 unsigned FrameRegUpdateFlags;
3682
3683 // Use zeroing instruction variants.
3684 bool ZeroData;
3685 DebugLoc DL;
3686
3687 void emitUnrolled(MachineBasicBlock::iterator InsertI);
3688 void emitLoop(MachineBasicBlock::iterator InsertI);
3689
3690public:
3691 TagStoreEdit(MachineBasicBlock *MBB, bool ZeroData)
3692 : MBB(MBB), ZeroData(ZeroData) {
3693 MF = MBB->getParent();
3694 MRI = &MF->getRegInfo();
3695 }
3696 // Add an instruction to be replaced. Instructions must be added in the
3697 // ascending order of Offset, and have to be adjacent.
3698 void addInstruction(TagStoreInstr I) {
3699 assert((TagStores.empty() ||
3700 TagStores.back().Offset + TagStores.back().Size == I.Offset) &&
3701 "Non-adjacent tag store instructions.");
3702 TagStores.push_back(I);
3703 }
3704 void clear() { TagStores.clear(); }
3705 // Emit equivalent code at the given location, and erase the current set of
3706 // instructions. May skip if the replacement is not profitable. May invalidate
3707 // the input iterator and replace it with a valid one.
3708 void emitCode(MachineBasicBlock::iterator &InsertI,
3709 const AArch64FrameLowering *TFI, bool TryMergeSPUpdate);
3710};
3711
3712void TagStoreEdit::emitUnrolled(MachineBasicBlock::iterator InsertI) {
3713 const AArch64InstrInfo *TII =
3714 MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3715
3716 const int64_t kMinOffset = -256 * 16;
3717 const int64_t kMaxOffset = 255 * 16;
3718
3719 Register BaseReg = FrameReg;
3720 int64_t BaseRegOffsetBytes = FrameRegOffset.getFixed();
3721 if (BaseRegOffsetBytes < kMinOffset ||
3722 BaseRegOffsetBytes + (Size - Size % 32) > kMaxOffset ||
3723 // BaseReg can be FP, which is not necessarily aligned to 16-bytes. In
3724 // that case, BaseRegOffsetBytes will not be aligned to 16 bytes, which
3725 // is required for the offset of ST2G.
3726 BaseRegOffsetBytes % 16 != 0) {
3727 Register ScratchReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3728 emitFrameOffset(*MBB, InsertI, DL, ScratchReg, BaseReg,
3729 StackOffset::getFixed(BaseRegOffsetBytes), TII);
3730 BaseReg = ScratchReg;
3731 BaseRegOffsetBytes = 0;
3732 }
3733
3734 MachineInstr *LastI = nullptr;
3735 while (Size) {
3736 int64_t InstrSize = (Size > 16) ? 32 : 16;
3737 unsigned Opcode =
3738 InstrSize == 16
3739 ? (ZeroData ? AArch64::STZGi : AArch64::STGi)
3740 : (ZeroData ? AArch64::STZ2Gi : AArch64::ST2Gi);
3741 assert(BaseRegOffsetBytes % 16 == 0);
3742 MachineInstr *I = BuildMI(*MBB, InsertI, DL, TII->get(Opcode))
3743 .addReg(AArch64::SP)
3744 .addReg(BaseReg)
3745 .addImm(BaseRegOffsetBytes / 16)
3746 .setMemRefs(CombinedMemRefs);
3747 // A store to [BaseReg, #0] should go last for an opportunity to fold the
3748 // final SP adjustment in the epilogue.
3749 if (BaseRegOffsetBytes == 0)
3750 LastI = I;
3751 BaseRegOffsetBytes += InstrSize;
3752 Size -= InstrSize;
3753 }
3754
3755 if (LastI)
3756 MBB->splice(InsertI, MBB, LastI);
3757}
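// --- Illustrative sketch (editor's addition, not part of the upstream
// source): STG/ST2G encode a signed 9-bit immediate scaled by 16, so the
// window emitUnrolled can address directly from the base register is
// [-256*16, 255*16] in 16-byte steps; anything else forces the scratch
// register path above. A hypothetical predicate for that check:
[[maybe_unused]] static bool stgOffsetDirectlyEncodable(int64_t OffsetBytes) {
  return OffsetBytes >= -256 * 16 && OffsetBytes <= 255 * 16 &&
         OffsetBytes % 16 == 0; // the encoded immediate is OffsetBytes / 16
}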
3758
3759void TagStoreEdit::emitLoop(MachineBasicBlock::iterator InsertI) {
3760 const AArch64InstrInfo *TII =
3761 MF->getSubtarget<AArch64Subtarget>().getInstrInfo();
3762
3763 Register BaseReg = FrameRegUpdate
3764 ? FrameReg
3765 : MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3766 Register SizeReg = MRI->createVirtualRegister(&AArch64::GPR64RegClass);
3767
3768 emitFrameOffset(*MBB, InsertI, DL, BaseReg, FrameReg, FrameRegOffset, TII);
3769
3770 int64_t LoopSize = Size;
3771 // If the loop size is not a multiple of 32, split off one 16-byte store at
3772 // the end to fold BaseReg update into.
3773 if (FrameRegUpdate && *FrameRegUpdate)
3774 LoopSize -= LoopSize % 32;
3775 MachineInstr *LoopI = BuildMI(*MBB, InsertI, DL,
3776 TII->get(ZeroData ? AArch64::STZGloop_wback
3777 : AArch64::STGloop_wback))
3778 .addDef(SizeReg)
3779 .addDef(BaseReg)
3780 .addImm(LoopSize)
3781 .addReg(BaseReg)
3782 .setMemRefs(CombinedMemRefs);
3783 if (FrameRegUpdate)
3784 LoopI->setFlags(FrameRegUpdateFlags);
3785
3786 int64_t ExtraBaseRegUpdate =
3787 FrameRegUpdate ? (*FrameRegUpdate - FrameRegOffset.getFixed() - Size) : 0;
3788 if (LoopSize < Size) {
3789 assert(FrameRegUpdate);
3790 assert(Size - LoopSize == 16);
3791 // Tag 16 more bytes at BaseReg and update BaseReg.
3792 BuildMI(*MBB, InsertI, DL,
3793 TII->get(ZeroData ? AArch64::STZGPostIndex : AArch64::STGPostIndex))
3794 .addDef(BaseReg)
3795 .addReg(BaseReg)
3796 .addReg(BaseReg)
3797 .addImm(1 + ExtraBaseRegUpdate / 16)
3798 .setMemRefs(CombinedMemRefs)
3799 .setMIFlags(FrameRegUpdateFlags);
3800 } else if (ExtraBaseRegUpdate) {
3801 // Update BaseReg.
3802 BuildMI(
3803 *MBB, InsertI, DL,
3804 TII->get(ExtraBaseRegUpdate > 0 ? AArch64::ADDXri : AArch64::SUBXri))
3805 .addDef(BaseReg)
3806 .addReg(BaseReg)
3807 .addImm(std::abs(ExtraBaseRegUpdate))
3808 .addImm(0)
3809 .setMIFlags(FrameRegUpdateFlags);
3810 }
3811}
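// --- Illustrative sketch (editor's addition, not part of the upstream
// source): when a base-register update is folded in, the loop handles only
// the largest 32-byte multiple and one post-indexed STG tags the final 16
// bytes. A standalone version of that size split for a hypothetical region:
[[maybe_unused]] static void stgLoopSizeSplitSketch() {
  int64_t Size = 80;
  int64_t LoopSize = Size - Size % 32; // 64 bytes go through STGloop_wback
  assert(LoopSize == 64 && Size - LoopSize == 16); // 16 left for STGPostIndex
}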
3812
3813// Check if *II is a register update that can be merged into STGloop that ends
3814// at (Reg + Size). RemainingOffset is the required adjustment to Reg after the
3815// end of the loop.
3816bool canMergeRegUpdate(MachineBasicBlock::iterator II, unsigned Reg,
3817 int64_t Size, int64_t *TotalOffset) {
3818 MachineInstr &MI = *II;
3819 if ((MI.getOpcode() == AArch64::ADDXri ||
3820 MI.getOpcode() == AArch64::SUBXri) &&
3821 MI.getOperand(0).getReg() == Reg && MI.getOperand(1).getReg() == Reg) {
3822 unsigned Shift = AArch64_AM::getShiftValue(MI.getOperand(3).getImm());
3823 int64_t Offset = MI.getOperand(2).getImm() << Shift;
3824 if (MI.getOpcode() == AArch64::SUBXri)
3825 Offset = -Offset;
3826 int64_t AbsPostOffset = std::abs(Offset - Size);
3827 const int64_t kMaxOffset =
3828 0xFFF; // Max encoding for unshifted ADDXri / SUBXri
3829 if (AbsPostOffset <= kMaxOffset && AbsPostOffset % 16 == 0) {
3830 *TotalOffset = Offset;
3831 return true;
3832 }
3833 }
3834 return false;
3835}
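// --- Illustrative sketch (editor's addition, not part of the upstream
// source): the merge above is legal exactly when the leftover adjustment
// after the loop, |Offset - Size|, still fits the unshifted 12-bit
// ADDXri/SUBXri immediate and remains 16-byte aligned:
[[maybe_unused]] static bool leftoverUpdateEncodableSketch(int64_t Offset,
                                                           int64_t Size) {
  int64_t AbsPostOffset = std::abs(Offset - Size);
  return AbsPostOffset <= 0xFFF && AbsPostOffset % 16 == 0;
}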
3836
3837void mergeMemRefs(const SmallVectorImpl<TagStoreInstr> &TSE,
3838 SmallVectorImpl<MachineMemOperand *> &MemRefs) {
3839 MemRefs.clear();
3840 for (auto &TS : TSE) {
3841 MachineInstr *MI = TS.MI;
3842 // An instruction without memory operands may access anything. Be
3843 // conservative and return an empty list.
3844 if (MI->memoperands_empty()) {
3845 MemRefs.clear();
3846 return;
3847 }
3848 MemRefs.append(MI->memoperands_begin(), MI->memoperands_end());
3849 }
3850}
3851
3852void TagStoreEdit::emitCode(MachineBasicBlock::iterator &InsertI,
3853 const AArch64FrameLowering *TFI,
3854 bool TryMergeSPUpdate) {
3855 if (TagStores.empty())
3856 return;
3857 TagStoreInstr &FirstTagStore = TagStores[0];
3858 TagStoreInstr &LastTagStore = TagStores[TagStores.size() - 1];
3859 Size = LastTagStore.Offset - FirstTagStore.Offset + LastTagStore.Size;
3860 DL = TagStores[0].MI->getDebugLoc();
3861
3862 Register Reg;
3863 FrameRegOffset = TFI->resolveFrameOffsetReference(
3864 *MF, FirstTagStore.Offset, false /*isFixed*/, false /*isSVE*/, Reg,
3865 /*PreferFP=*/false, /*ForSimm=*/true);
3866 FrameReg = Reg;
3867 FrameRegUpdate = std::nullopt;
3868
3869 mergeMemRefs(TagStores, CombinedMemRefs);
3870
3871 LLVM_DEBUG(dbgs() << "Replacing adjacent STG instructions:\n";
3872 for (const auto &Instr
3873 : TagStores) { dbgs() << " " << *Instr.MI; });
3874
3875 // Size threshold where a loop becomes shorter than a linear sequence of
3876 // tagging instructions.
3877 const int kSetTagLoopThreshold = 176;
3878 if (Size < kSetTagLoopThreshold) {
3879 if (TagStores.size() < 2)
3880 return;
3881 emitUnrolled(InsertI);
3882 } else {
3883 MachineInstr *UpdateInstr = nullptr;
3884 int64_t TotalOffset = 0;
3885 if (TryMergeSPUpdate) {
3886 // See if we can merge base register update into the STGloop.
3887 // This is done in AArch64LoadStoreOptimizer for "normal" stores,
3888 // but STGloop is way too unusual for that, and also it only
3889 // realistically happens in function epilogue. Also, STGloop is expanded
3890 // before that pass.
3891 if (InsertI != MBB->end() &&
3892 canMergeRegUpdate(InsertI, FrameReg, FrameRegOffset.getFixed() + Size,
3893 &TotalOffset)) {
3894 UpdateInstr = &*InsertI++;
3895 LLVM_DEBUG(dbgs() << "Folding SP update into loop:\n "
3896 << *UpdateInstr);
3897 }
3898 }
3899
3900 if (!UpdateInstr && TagStores.size() < 2)
3901 return;
3902
3903 if (UpdateInstr) {
3904 FrameRegUpdate = TotalOffset;
3905 FrameRegUpdateFlags = UpdateInstr->getFlags();
3906 }
3907 emitLoop(InsertI);
3908 if (UpdateInstr)
3909 UpdateInstr->eraseFromParent();
3910 }
3911
3912 for (auto &TS : TagStores)
3913 TS.MI->eraseFromParent();
3914}
3915
3916bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
3917 int64_t &Size, bool &ZeroData) {
3918 MachineFunction &MF = *MI.getParent()->getParent();
3919 const MachineFrameInfo &MFI = MF.getFrameInfo();
3920
3921 unsigned Opcode = MI.getOpcode();
3922 ZeroData = (Opcode == AArch64::STZGloop || Opcode == AArch64::STZGi ||
3923 Opcode == AArch64::STZ2Gi);
3924
3925 if (Opcode == AArch64::STGloop || Opcode == AArch64::STZGloop) {
3926 if (!MI.getOperand(0).isDead() || !MI.getOperand(1).isDead())
3927 return false;
3928 if (!MI.getOperand(2).isImm() || !MI.getOperand(3).isFI())
3929 return false;
3930 Offset = MFI.getObjectOffset(MI.getOperand(3).getIndex());
3931 Size = MI.getOperand(2).getImm();
3932 return true;
3933 }
3934
3935 if (Opcode == AArch64::STGi || Opcode == AArch64::STZGi)
3936 Size = 16;
3937 else if (Opcode == AArch64::ST2Gi || Opcode == AArch64::STZ2Gi)
3938 Size = 32;
3939 else
3940 return false;
3941
3942 if (MI.getOperand(0).getReg() != AArch64::SP || !MI.getOperand(1).isFI())
3943 return false;
3944
3945 Offset = MFI.getObjectOffset(MI.getOperand(1).getIndex()) +
3946 16 * MI.getOperand(2).getImm();
3947 return true;
3948}
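// --- Illustrative sketch (editor's addition, not part of the upstream
// source): for the non-loop forms the immediate operand is scaled by 16, so
// the in-frame byte offset recovered above reduces to this arithmetic:
[[maybe_unused]] static int64_t tagStoreByteOffsetSketch(int64_t ObjectOffset,
                                                         int64_t Imm) {
  return ObjectOffset + 16 * Imm; // e.g. a slot at -32 with imm 1 -> -16
}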
3949
3950// Detect a run of memory tagging instructions for adjacent stack frame slots,
3951// and replace them with a shorter instruction sequence:
3952// * replace STG + STG with ST2G
3953// * replace STGloop + STGloop with STGloop
3954// This code needs to run when stack slot offsets are already known, but before
3955// FrameIndex operands in STG instructions are eliminated.
3956 MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
3957 const AArch64FrameLowering *TFI,
3958 RegScavenger *RS) {
3959 bool FirstZeroData;
3960 int64_t Size, Offset;
3961 MachineInstr &MI = *II;
3962 MachineBasicBlock *MBB = MI.getParent();
3963 MachineBasicBlock::iterator NextI = ++II;
3964 if (&MI == &MBB->instr_back())
3965 return II;
3966 if (!isMergeableStackTaggingInstruction(MI, Offset, Size, FirstZeroData))
3967 return II;
3968
3969 SmallVector<TagStoreInstr, 8> Instrs;
3970 Instrs.emplace_back(&MI, Offset, Size);
3971
3972 constexpr int kScanLimit = 10;
3973 int Count = 0;
3974 for (MachineBasicBlock::iterator E = MBB->end();
3975 NextI != E && Count < kScanLimit; ++NextI) {
3976 MachineInstr &MI = *NextI;
3977 bool ZeroData;
3978 int64_t Size, Offset;
3979 // Collect instructions that update memory tags with a FrameIndex operand
3980 // and (when applicable) constant size, and whose output registers are dead
3981 // (the latter is almost always the case in practice). Since these
3982 // instructions effectively have no inputs or outputs, we are free to skip
3983 // any non-aliasing instructions in between without tracking used registers.
3984 if (isMergeableStackTaggingInstruction(MI, Offset, Size, ZeroData)) {
3985 if (ZeroData != FirstZeroData)
3986 break;
3987 Instrs.emplace_back(&MI, Offset, Size);
3988 continue;
3989 }
3990
3991 // Only count non-transient, non-tagging instructions toward the scan
3992 // limit.
3993 if (!MI.isTransient())
3994 ++Count;
3995
3996 // Just in case, stop before the epilogue code starts.
3997 if (MI.getFlag(MachineInstr::FrameSetup) ||
3998 MI.getFlag(MachineInstr::FrameDestroy))
3999 break;
4000
4001 // Reject anything that may alias the collected instructions.
4002 if (MI.mayLoadOrStore() || MI.hasUnmodeledSideEffects())
4003 break;
4004 }
4005
4006 // New code will be inserted after the last tagging instruction we've found.
4007 MachineBasicBlock::iterator InsertI = Instrs.back().MI;
4008
4009 // All the gathered stack tag instructions are merged and placed after the
4010 // last tag store in the list. Check whether the nzcv flag is live at the
4011 // insertion point; if it is, bail out, since any stg loop emitted for the
4012 // merged sequence would clobber it.
4013
4014 // FIXME: This bail-out is conservative: the liveness check is performed
4015 // even when the merged sequence contains no stg loops, in which case it
4016 // is not needed.
4017 LivePhysRegs LiveRegs(*(MBB->getParent()->getSubtarget().getRegisterInfo()));
4018 LiveRegs.addLiveOuts(*MBB);
4019 for (auto I = MBB->rbegin();; ++I) {
4020 MachineInstr &MI = *I;
4021 if (MI == InsertI)
4022 break;
4023 LiveRegs.stepBackward(*I);
4024 }
4025 InsertI++;
4026 if (LiveRegs.contains(AArch64::NZCV))
4027 return InsertI;
4028
4029 llvm::stable_sort(Instrs,
4030 [](const TagStoreInstr &Left, const TagStoreInstr &Right) {
4031 return Left.Offset < Right.Offset;
4032 });
4033
4034 // Make sure that we don't have any overlapping stores.
4035 int64_t CurOffset = Instrs[0].Offset;
4036 for (auto &Instr : Instrs) {
4037 if (CurOffset > Instr.Offset)
4038 return NextI;
4039 CurOffset = Instr.Offset + Instr.Size;
4040 }
4041
4042 // Find contiguous runs of tagged memory and emit shorter instruction
4043 // sequences for them when possible.
4044 TagStoreEdit TSE(MBB, FirstZeroData);
4045 std::optional<int64_t> EndOffset;
4046 for (auto &Instr : Instrs) {
4047 if (EndOffset && *EndOffset != Instr.Offset) {
4048 // Found a gap.
4049 TSE.emitCode(InsertI, TFI, /*TryMergeSPUpdate = */ false);
4050 TSE.clear();
4051 }
4052
4053 TSE.addInstruction(Instr);
4054 EndOffset = Instr.Offset + Instr.Size;
4055 }
4056
4057 const MachineFunction *MF = MBB->getParent();
4058 // Multiple FP/SP updates in a loop cannot be described by CFI instructions.
4059 TSE.emitCode(
4060 InsertI, TFI, /*TryMergeSPUpdate = */
4061 !MF->getInfo<AArch64FunctionInfo>()->needsAsyncDwarfUnwindInfo(*MF));
4062
4063 return InsertI;
4064}
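// --- Illustrative sketch (editor's addition, not part of the upstream
// source): after the stable sort by offset, the overlap check in
// tryMergeAdjacentSTG accepts a run only if each store begins at or after
// the end of its predecessor; exact equality is the contiguity that lets
// TagStoreEdit fuse a run into a single ST2G/STGloop sequence:
[[maybe_unused]] static bool
tagStoresNonOverlappingSketch(ArrayRef<TagStoreInstr> Instrs) {
  int64_t CurOffset = Instrs[0].Offset;
  for (const TagStoreInstr &I : Instrs) {
    if (CurOffset > I.Offset)
      return false; // overlapping stores: bail out of the merge
    CurOffset = I.Offset + I.Size;
  }
  return true;
}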
4065} // namespace
4066
4067 void AArch64FrameLowering::processFunctionBeforeFrameIndicesReplaced(
4068 MachineFunction &MF, RegScavenger *RS = nullptr) const {
4069 if (StackTaggingMergeSetTag)
4070 for (auto &BB : MF)
4071 for (MachineBasicBlock::iterator II = BB.begin(); II != BB.end();)
4072 II = tryMergeAdjacentSTG(II, this, RS);
4073}
4074
4075/// For Win64 AArch64 EH, the offset to the Unwind object is from the SP
4076/// before the update. This is easily retrieved as it is exactly the offset
4077/// that is set in processFunctionBeforeFrameFinalized.
4078 StackOffset AArch64FrameLowering::getFrameIndexReferencePreferSP(
4079 const MachineFunction &MF, int FI, Register &FrameReg,
4080 bool IgnoreSPUpdates) const {
4081 const MachineFrameInfo &MFI = MF.getFrameInfo();
4082 if (IgnoreSPUpdates) {
4083 LLVM_DEBUG(dbgs() << "Offset from the SP for " << FI << " is "
4084 << MFI.getObjectOffset(FI) << "\n");
4085 FrameReg = AArch64::SP;
4086 return StackOffset::getFixed(MFI.getObjectOffset(FI));
4087 }
4088
4089 // Go to common code if we cannot provide sp + offset.
4090 if (MFI.hasVarSizedObjects() ||
4091 MF.getInfo<AArch64FunctionInfo>()->getStackSizeSVE() ||
4092 MF.getSubtarget().getRegisterInfo()->hasStackRealignment(MF))
4093 return getFrameIndexReference(MF, FI, FrameReg);
4094
4095 FrameReg = AArch64::SP;
4096 return getStackOffset(MF, MFI.getObjectOffset(FI));
4097}
4098
4099/// The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve
4100/// the parent's frame pointer.
4101 unsigned AArch64FrameLowering::getWinEHParentFrameOffset(
4102 const MachineFunction &MF) const {
4103 return 0;
4104}
4105
4106/// Funclets only need to account for space for the callee saved registers,
4107/// as the locals are accounted for in the parent's stack frame.
4108 unsigned AArch64FrameLowering::getWinEHFuncletFrameSize(
4109 const MachineFunction &MF) const {
4110 // This is the size of the pushed CSRs.
4111 unsigned CSSize =
4112 MF.getInfo<AArch64FunctionInfo>()->getCalleeSavedStackSize();
4113 // This is the amount of stack a funclet needs to allocate.
4114 return alignTo(CSSize + MF.getFrameInfo().getMaxCallFrameSize(),
4115 getStackAlign());
4116}
4117
4118namespace {
4119struct FrameObject {
4120 bool IsValid = false;
4121 // Index of the object in MFI.
4122 int ObjectIndex = 0;
4123 // Group ID this object belongs to.
4124 int GroupIndex = -1;
4125 // This object should be placed first (closest to SP).
4126 bool ObjectFirst = false;
4127 // This object's group (which always contains the object with
4128 // ObjectFirst==true) should be placed first.
4129 bool GroupFirst = false;
4130};
4131
4132class GroupBuilder {
4133 SmallVector<int, 8> CurrentMembers;
4134 int NextGroupIndex = 0;
4135 std::vector<FrameObject> &Objects;
4136
4137public:
4138 GroupBuilder(std::vector<FrameObject> &Objects) : Objects(Objects) {}
4139 void AddMember(int Index) { CurrentMembers.push_back(Index); }
4140 void EndCurrentGroup() {
4141 if (CurrentMembers.size() > 1) {
4142 // Create a new group with the current member list. This might remove them
4143 // from their pre-existing groups. That's OK, dealing with overlapping
4144 // groups is too hard and unlikely to make a difference.
4145 LLVM_DEBUG(dbgs() << "group:");
4146 for (int Index : CurrentMembers) {
4147 Objects[Index].GroupIndex = NextGroupIndex;
4148 LLVM_DEBUG(dbgs() << " " << Index);
4149 }
4150 LLVM_DEBUG(dbgs() << "\n");
4151 NextGroupIndex++;
4152 }
4153 CurrentMembers.clear();
4154 }
4155};
4156
4157bool FrameObjectCompare(const FrameObject &A, const FrameObject &B) {
4158 // Objects at a lower index are closer to FP; objects at a higher index are
4159 // closer to SP.
4160 //
4161 // For consistency in our comparison, all invalid objects are placed
4162 // at the end. This also allows us to stop walking when we hit the
4163 // first invalid item after it's all sorted.
4164 //
4165 // The "first" object goes first (closest to SP), followed by the members of
4166 // the "first" group.
4167 //
4168 // The rest are sorted by the group index to keep the groups together.
4169 // Higher numbered groups are more likely to be around longer (i.e. untagged
4170 // in the function epilogue and not at some earlier point). Place them closer
4171 // to SP.
4172 //
4173 // If all else equal, sort by the object index to keep the objects in the
4174 // original order.
4175 return std::make_tuple(!A.IsValid, A.ObjectFirst, A.GroupFirst, A.GroupIndex,
4176 A.ObjectIndex) <
4177 std::make_tuple(!B.IsValid, B.ObjectFirst, B.GroupFirst, B.GroupIndex,
4178 B.ObjectIndex);
4179}
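// --- Illustrative sketch (editor's addition, not part of the upstream
// source): std::make_tuple compares fields left to right and false orders
// before true, so ObjectFirst/GroupFirst entries sort towards the highest
// indices, i.e. closest to SP, which is what "first" means here:
[[maybe_unused]] static void frameObjectOrderSketch() {
  FrameObject A, B;
  A.IsValid = B.IsValid = true;
  A.GroupFirst = true; // member of the "first" group
  B.GroupIndex = 1;    // ordinary object from a later group
  // B sorts before A, leaving A at a higher index (closer to SP).
  assert(FrameObjectCompare(B, A) && !FrameObjectCompare(A, B));
}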
4180} // namespace
4181
4182 void AArch64FrameLowering::orderFrameObjects(
4183 const MachineFunction &MF, SmallVectorImpl<int> &ObjectsToAllocate) const {
4184 if (!OrderFrameObjects || ObjectsToAllocate.empty())
4185 return;
4186
4187 const MachineFrameInfo &MFI = MF.getFrameInfo();
4188 std::vector<FrameObject> FrameObjects(MFI.getObjectIndexEnd());
4189 for (auto &Obj : ObjectsToAllocate) {
4190 FrameObjects[Obj].IsValid = true;
4191 FrameObjects[Obj].ObjectIndex = Obj;
4192 }
4193
4194 // Identify stack slots that are tagged at the same time.
4195 GroupBuilder GB(FrameObjects);
4196 for (auto &MBB : MF) {
4197 for (auto &MI : MBB) {
4198 if (MI.isDebugInstr())
4199 continue;
4200 int OpIndex;
4201 switch (MI.getOpcode()) {
4202 case AArch64::STGloop:
4203 case AArch64::STZGloop:
4204 OpIndex = 3;
4205 break;
4206 case AArch64::STGi:
4207 case AArch64::STZGi:
4208 case AArch64::ST2Gi:
4209 case AArch64::STZ2Gi:
4210 OpIndex = 1;
4211 break;
4212 default:
4213 OpIndex = -1;
4214 }
4215
4216 int TaggedFI = -1;
4217 if (OpIndex >= 0) {
4218 const MachineOperand &MO = MI.getOperand(OpIndex);
4219 if (MO.isFI()) {
4220 int FI = MO.getIndex();
4221 if (FI >= 0 && FI < MFI.getObjectIndexEnd() &&
4222 FrameObjects[FI].IsValid)
4223 TaggedFI = FI;
4224 }
4225 }
4226
4227 // If this is a stack tagging instruction for a slot that is not part of a
4228 // group yet, either start a new group or add it to the current one.
4229 if (TaggedFI >= 0)
4230 GB.AddMember(TaggedFI);
4231 else
4232 GB.EndCurrentGroup();
4233 }
4234 // Groups should never span multiple basic blocks.
4235 GB.EndCurrentGroup();
4236 }
4237
4238 // If the function's tagged base pointer is pinned to a stack slot, we want to
4239 // put that slot first when possible. This will likely place it at SP + 0,
4240 // and save one instruction when generating the base pointer because IRG does
4241 // not allow an immediate offset.
4242 const AArch64FunctionInfo &AFI = *MF.getInfo<AArch64FunctionInfo>();
4243 std::optional<int> TBPI = AFI.getTaggedBasePointerIndex();
4244 if (TBPI) {
4245 FrameObjects[*TBPI].ObjectFirst = true;
4246 FrameObjects[*TBPI].GroupFirst = true;
4247 int FirstGroupIndex = FrameObjects[*TBPI].GroupIndex;
4248 if (FirstGroupIndex >= 0)
4249 for (FrameObject &Object : FrameObjects)
4250 if (Object.GroupIndex == FirstGroupIndex)
4251 Object.GroupFirst = true;
4252 }
4253
4254 llvm::stable_sort(FrameObjects, FrameObjectCompare);
4255
4256 int i = 0;
4257 for (auto &Obj : FrameObjects) {
4258 // All invalid items are sorted at the end, so it's safe to stop.
4259 if (!Obj.IsValid)
4260 break;
4261 ObjectsToAllocate[i++] = Obj.ObjectIndex;
4262 }
4263
4264 LLVM_DEBUG(dbgs() << "Final frame order:\n"; for (auto &Obj
4265 : FrameObjects) {
4266 if (!Obj.IsValid)
4267 break;
4268 dbgs() << " " << Obj.ObjectIndex << ": group " << Obj.GroupIndex;
4269 if (Obj.ObjectFirst)
4270 dbgs() << ", first";
4271 if (Obj.GroupFirst)
4272 dbgs() << ", group-first";
4273 dbgs() << "\n";
4274 });
4275}
4276
4277/// Emit a loop to decrement SP until it is equal to TargetReg, with probes at
4278/// least every ProbeSize bytes. Returns an iterator of the first instruction
4279/// after the loop. The difference between SP and TargetReg must be an exact
4280/// multiple of ProbeSize.
4281 MachineBasicBlock::iterator
4282 AArch64FrameLowering::inlineStackProbeLoopExactMultiple(
4283 MachineBasicBlock::iterator MBBI, int64_t ProbeSize,
4284 Register TargetReg) const {
4285 MachineBasicBlock &MBB = *MBBI->getParent();
4286 MachineFunction &MF = *MBB.getParent();
4287 const AArch64InstrInfo *TII =
4288 MF.getSubtarget<AArch64Subtarget>().getInstrInfo();
4289 DebugLoc DL = MBB.findDebugLoc(MBBI);
4290
4291 MachineFunction::iterator MBBInsertPoint = std::next(MBB.getIterator());
4292 MachineBasicBlock *LoopMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
4293 MF.insert(MBBInsertPoint, LoopMBB);
4294 MachineBasicBlock *ExitMBB = MF.CreateMachineBasicBlock(MBB.getBasicBlock());
4295 MF.insert(MBBInsertPoint, ExitMBB);
4296
4297 // SUB SP, SP, #ProbeSize (or equivalent if ProbeSize is not encodable
4298 // in SUB).
4299 emitFrameOffset(*LoopMBB, LoopMBB->end(), DL, AArch64::SP, AArch64::SP,
4300 StackOffset::getFixed(-ProbeSize), TII,
4301 MachineInstr::FrameSetup);
4302 // STR XZR, [SP]
4303 BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::STRXui))
4304 .addReg(AArch64::XZR)
4305 .addReg(AArch64::SP)
4306 .addImm(0)
4307 .setMIFlags(MachineInstr::FrameSetup);
4308 // CMP SP, TargetReg
4309 BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::SUBSXrx64),
4310 AArch64::XZR)
4311 .addReg(AArch64::SP)
4312 .addReg(TargetReg)
4313 .addImm(AArch64_AM::getArithExtendImm(AArch64_AM::UXTX, 0))
4314 .setMIFlags(MachineInstr::FrameSetup);
4315 // B.CC Loop
4316 BuildMI(*LoopMBB, LoopMBB->end(), DL, TII->get(AArch64::Bcc))
4317 .addImm(AArch64CC::NE)
4318 .addMBB(LoopMBB)
4319 .setMIFlags(MachineInstr::FrameSetup);
4320
4321 LoopMBB->addSuccessor(ExitMBB);
4322 LoopMBB->addSuccessor(LoopMBB);
4323 // Synthesize the exit MBB.
4324 ExitMBB->splice(ExitMBB->end(), &MBB, MBBI, MBB.end());
4325 ExitMBB->transferSuccessorsAndUpdatePHIs(&MBB);
4326 MBB.addSuccessor(LoopMBB);
4327 // Update liveins.
4328 bool anyChange = false;
4329 do {
4330 anyChange = recomputeLiveIns(*ExitMBB) || recomputeLiveIns(*LoopMBB);
4331 } while (anyChange);
4332
4333 return ExitMBB->begin();
4334}
4335
4336void AArch64FrameLowering::inlineStackProbeFixed(
4337 MachineBasicBlock::iterator MBBI, Register ScratchReg, int64_t FrameSize,
4338 StackOffset CFAOffset) const {
4339 MachineBasicBlock *MBB = MBBI->getParent();
4340 MachineFunction &MF = *MBB->getParent();
4341 const AArch64InstrInfo *TII =
4342 MF.getSubtarget<AArch64Subtarget>().getInstrInfo();
4343 AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
4344 bool EmitAsyncCFI = AFI->needsAsyncDwarfUnwindInfo(MF);
4345 bool HasFP = hasFP(MF);
4346
4347 DebugLoc DL;
4348 int64_t ProbeSize = MF.getInfo<AArch64FunctionInfo>()->getStackProbeSize();
4349 int64_t NumBlocks = FrameSize / ProbeSize;
4350 int64_t ResidualSize = FrameSize % ProbeSize;
4351
4352 LLVM_DEBUG(dbgs() << "Stack probing: total " << FrameSize << " bytes, "
4353 << NumBlocks << " blocks of " << ProbeSize
4354 << " bytes, plus " << ResidualSize << " bytes\n");
4355
4356 // Decrement SP by NumBlock * ProbeSize bytes, with either unrolled or
4357 // ordinary loop.
4358 if (NumBlocks <= AArch64::StackProbeMaxLoopUnroll) {
4359 for (int i = 0; i < NumBlocks; ++i) {
4360 // SUB SP, SP, #ProbeSize (or equivalent if ProbeSize is not
4361 // encodable in a SUB).
4362 emitFrameOffset(*MBB, MBBI, DL, AArch64::SP, AArch64::SP,
4363 StackOffset::getFixed(-ProbeSize), TII,
4364 MachineInstr::FrameSetup, false, false, nullptr,
4365 EmitAsyncCFI && !HasFP, CFAOffset);
4366 CFAOffset += StackOffset::getFixed(ProbeSize);
4367 // STR XZR, [SP]
4368 BuildMI(*MBB, MBBI, DL, TII->get(AArch64::STRXui))
4369 .addReg(AArch64::XZR)
4370 .addReg(AArch64::SP)
4371 .addImm(0)
4372 .setMIFlags(MachineInstr::FrameSetup);
4373 }
4374 } else if (NumBlocks != 0) {
4375 // SUB ScratchReg, SP, #FrameSize (or equivalent if FrameSize is not
4376 // encodable in ADD). ScrathReg may temporarily become the CFA register.
4377 emitFrameOffset(*MBB, MBBI, DL, ScratchReg, AArch64::SP,
4378 StackOffset::getFixed(-ProbeSize * NumBlocks), TII,
4379 MachineInstr::FrameSetup, false, false, nullptr,
4380 EmitAsyncCFI && !HasFP, CFAOffset);
4381 CFAOffset += StackOffset::getFixed(ProbeSize * NumBlocks);
4382 MBBI = inlineStackProbeLoopExactMultiple(MBBI, ProbeSize, ScratchReg);
4383 MBB = MBBI->getParent();
4384 if (EmitAsyncCFI && !HasFP) {
4385 // Set the CFA register back to SP.
4386 const AArch64RegisterInfo &RegInfo =
4387 *MF.getSubtarget<AArch64Subtarget>().getRegisterInfo();
4388 unsigned Reg = RegInfo.getDwarfRegNum(AArch64::SP, true);
4389 unsigned CFIIndex =
4390 MF.addFrameInst(MCCFIInstruction::createDefCfaRegister(nullptr, Reg));
4391 BuildMI(*MBB, MBBI, DL, TII->get(TargetOpcode::CFI_INSTRUCTION))
4392 .addCFIIndex(CFIIndex)
4393 .setMIFlags(MachineInstr::FrameSetup);
4394 }
4395 }
4396
4397 if (ResidualSize != 0) {
4398 // SUB SP, SP, #ResidualSize (or equivalent if ResidualSize is not encodable
4399 // in SUB).
4400 emitFrameOffset(*MBB, MBBI, DL, AArch64::SP, AArch64::SP,
4401 StackOffset::getFixed(-ResidualSize), TII,
4402 MachineInstr::FrameSetup, false, false, nullptr,
4403 EmitAsyncCFI && !HasFP, CFAOffset);
4404 if (ResidualSize > AArch64::StackProbeMaxUnprobedStack) {
4405 // STR XZR, [SP]
4406 BuildMI(*MBB, MBBI, DL, TII->get(AArch64::STRXui))
4407 .addReg(AArch64::XZR)
4408 .addReg(AArch64::SP)
4409 .addImm(0)
4410 .setMIFlags(MachineInstr::FrameSetup);
4411 }
4412 }
4413}
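// --- Illustrative sketch (editor's addition, not part of the upstream
// source): the probing decomposition above splits the allocation into whole
// probe-sized blocks plus a residual, and only a residual larger than the
// architectural unprobed-stack allowance gets the extra trailing store:
[[maybe_unused]] static void probeSizeSplitSketch() {
  int64_t FrameSize = 10000, ProbeSize = 4096; // hypothetical sizes
  assert(FrameSize / ProbeSize == 2);    // two fully probed blocks
  assert(FrameSize % ProbeSize == 1808); // residual allocated afterwards
}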
4414
4415void AArch64FrameLowering::inlineStackProbe(MachineFunction &MF,
4416 MachineBasicBlock &MBB) const {
4417 // Get the instructions that need to be replaced. We emit at most two of
4418 // these. Remember them in order to avoid complications coming from the need
4419 // to traverse the block while potentially creating more blocks.
4420 SmallVector<MachineInstr *, 4> ToReplace;
4421 for (MachineInstr &MI : MBB)
4422 if (MI.getOpcode() == AArch64::PROBED_STACKALLOC ||
4423 MI.getOpcode() == AArch64::PROBED_STACKALLOC_VAR)
4424 ToReplace.push_back(&MI);
4425
4426 for (MachineInstr *MI : ToReplace) {
4427 if (MI->getOpcode() == AArch64::PROBED_STACKALLOC) {
4428 Register ScratchReg = MI->getOperand(0).getReg();
4429 int64_t FrameSize = MI->getOperand(1).getImm();
4430 StackOffset CFAOffset = StackOffset::get(MI->getOperand(2).getImm(),
4431 MI->getOperand(3).getImm());
4432 inlineStackProbeFixed(MI->getIterator(), ScratchReg, FrameSize,
4433 CFAOffset);
4434 } else {
4435 assert(MI->getOpcode() == AArch64::PROBED_STACKALLOC_VAR &&
4436 "Stack probe pseudo-instruction expected");
4437 const AArch64InstrInfo *TII =
4438 MI->getMF()->getSubtarget<AArch64Subtarget>().getInstrInfo();
4439 Register TargetReg = MI->getOperand(0).getReg();
4440 (void)TII->probedStackAlloc(MI->getIterator(), TargetReg, true);
4441 }
4442 MI->eraseFromParent();
4443 }
4444}
unsigned const MachineRegisterInfo * MRI
#define Success
for(const MachineOperand &MO :llvm::drop_begin(OldMI.operands(), Desc.getNumOperands()))
static int64_t getArgumentStackToRestore(MachineFunction &MF, MachineBasicBlock &MBB)
Returns how much of the incoming argument stack area (in bytes) we should clean up in an epilogue.
static void emitShadowCallStackEpilogue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL)
static void getLiveRegsForEntryMBB(LivePhysRegs &LiveRegs, const MachineBasicBlock &MBB)
static void emitCalleeSavedRestores(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, bool SVE)
static void computeCalleeSaveRegisterPairs(MachineFunction &MF, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI, SmallVectorImpl< RegPairInfo > &RegPairs, bool NeedsFrameRecord)
static const unsigned DefaultSafeSPDisplacement
This is the biggest offset to the stack pointer we can encode in aarch64 instructions (without using ...
static void emitDefineCFAWithFP(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, unsigned FixedObject)
static bool needsWinCFI(const MachineFunction &MF)
static void insertCFISameValue(const MCInstrDesc &Desc, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator InsertPt, unsigned DwarfReg)
static cl::opt< bool > StackTaggingMergeSetTag("stack-tagging-merge-settag", cl::desc("merge settag instruction in function epilog"), cl::init(true), cl::Hidden)
static bool produceCompactUnwindFrame(MachineFunction &MF)
static int64_t determineSVEStackObjectOffsets(MachineFrameInfo &MFI, int &MinCSFrameIndex, int &MaxCSFrameIndex, bool AssignOffsets)
static cl::opt< bool > OrderFrameObjects("aarch64-order-frame-objects", cl::desc("sort stack allocations"), cl::init(true), cl::Hidden)
static bool windowsRequiresStackProbe(MachineFunction &MF, uint64_t StackSizeInBytes)
static void fixupCalleeSaveRestoreStackOffset(MachineInstr &MI, uint64_t LocalStackSize, bool NeedsWinCFI, bool *HasWinCFI)
static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2, bool NeedsWinCFI, bool IsFirst, const TargetRegisterInfo *TRI)
static MachineBasicBlock::iterator convertCalleeSaveRestoreToSPPrePostIncDec(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, const TargetInstrInfo *TII, int CSStackSizeInc, bool NeedsWinCFI, bool *HasWinCFI, bool EmitCFI, MachineInstr::MIFlag FrameFlag=MachineInstr::FrameSetup, int CFAOffset=0)
static void fixupSEHOpcode(MachineBasicBlock::iterator MBBI, unsigned LocalStackSize)
static StackOffset getSVEStackSize(const MachineFunction &MF)
Returns the size of the entire SVE stackframe (calleesaves + spills).
static cl::opt< bool > EnableRedZone("aarch64-redzone", cl::desc("enable use of redzone on AArch64"), cl::init(false), cl::Hidden)
static MachineBasicBlock::iterator InsertSEH(MachineBasicBlock::iterator MBBI, const TargetInstrInfo &TII, MachineInstr::MIFlag Flag)
static Register findScratchNonCalleeSaveRegister(MachineBasicBlock *MBB)
static void getLivePhysRegsUpTo(MachineInstr &MI, const TargetRegisterInfo &TRI, LivePhysRegs &LiveRegs)
Collect live registers from the end of MI's parent up to (including) MI in LiveRegs.
cl::opt< bool > EnableHomogeneousPrologEpilog("homogeneous-prolog-epilog", cl::Hidden, cl::desc("Emit homogeneous prologue and epilogue for the size " "optimization (default = off)"))
static bool IsSVECalleeSave(MachineBasicBlock::iterator I)
static bool invalidateRegisterPairing(unsigned Reg1, unsigned Reg2, bool UsesWinAAPCS, bool NeedsWinCFI, bool NeedsFrameRecord, bool IsFirst, const TargetRegisterInfo *TRI)
Returns true if Reg1 and Reg2 cannot be paired using a ldp/stp instruction.
static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg)
static StackOffset getFPOffset(const MachineFunction &MF, int64_t ObjectOffset)
static bool isTargetWindows(const MachineFunction &MF)
static StackOffset getStackOffset(const MachineFunction &MF, int64_t ObjectOffset)
static int64_t upperBound(StackOffset Size)
static unsigned estimateRSStackSizeLimit(MachineFunction &MF)
Look at each instruction that references stack frames and return the stack size limit beyond which so...
static bool getSVECalleeSaveSlotRange(const MachineFrameInfo &MFI, int &Min, int &Max)
returns true if there are any SVE callee saves.
static MCRegister getRegisterOrZero(MCRegister Reg, bool HasSVE)
static bool isFuncletReturnInstr(const MachineInstr &MI)
static void emitShadowCallStackPrologue(const TargetInstrInfo &TII, MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, bool NeedsWinCFI, bool NeedsUnwindInfo)
static unsigned getFixedObjectSize(const MachineFunction &MF, const AArch64FunctionInfo *AFI, bool IsWin64, bool IsFunclet)
Returns the size of the fixed object area (allocated next to sp on entry) On Win64 this may include a...
unsigned RegSize
MachineBasicBlock & MBB
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
MachineBasicBlock MachineBasicBlock::iterator MBBI
static const int kSetTagLoopThreshold
This file contains the simple types necessary to represent the attributes associated with functions a...
#define CASE(ATTRNAME, AANAME,...)
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
Analysis containing CSE Info
Definition: CSEInfo.cpp:27
#define LLVM_FALLTHROUGH
LLVM_FALLTHROUGH - Mark fallthrough cases in switch statements.
Definition: Compiler.h:301
static void clear(coro::Shape &Shape)
Definition: Coroutines.cpp:148
#define LLVM_DEBUG(X)
Definition: Debug.h:101
uint64_t Size
bool End
Definition: ELF_riscv.cpp:480
static const HTTPClientCleanup Cleanup
Definition: HTTPClient.cpp:42
const HexagonInstrInfo * TII
IRTranslator LLVM IR MI
This file implements the LivePhysRegs utility for tracking liveness of physical registers.
#define F(x, y, z)
Definition: MD5.cpp:55
#define I(x, y, z)
Definition: MD5.cpp:58
unsigned const TargetRegisterInfo * TRI
if(VerifyEach)
This file declares the machine register scavenger class.
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
unsigned OpIndex
This file defines the make_scope_exit function, which executes user-defined cleanup logic at scope ex...
This file defines the SmallVector class.
This file defines the 'Statistic' class, which is designed to be an easy way to expose various metric...
#define STATISTIC(VARNAME, DESC)
Definition: Statistic.h:167
static bool contains(SmallPtrSetImpl< ConstantExpr * > &Cache, ConstantExpr *Expr, Constant *C)
Definition: Value.cpp:469
static const unsigned FramePtr
void processFunctionBeforeFrameIndicesReplaced(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameIndicesReplaced - This method is called immediately before MO_FrameIndex op...
MachineBasicBlock::iterator eliminateCallFramePseudoInstr(MachineFunction &MF, MachineBasicBlock &MBB, MachineBasicBlock::iterator I) const override
This method is called during prolog/epilog code insertion to eliminate call frame setup and destroy p...
bool canUseAsPrologue(const MachineBasicBlock &MBB) const override
Check whether or not the given MBB can be used as a prologue for the target.
bool enableStackSlotScavenging(const MachineFunction &MF) const override
Returns true if the stack slot holes in the fixed and callee-save stack area should be used when allo...
bool spillCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, ArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
spillCalleeSavedRegisters - Issues instruction(s) to spill all callee saved registers and returns tru...
bool restoreCalleeSavedRegisters(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, MutableArrayRef< CalleeSavedInfo > CSI, const TargetRegisterInfo *TRI) const override
restoreCalleeSavedRegisters - Issues instruction(s) to restore all callee saved registers and returns...
StackOffset getNonLocalFrameIndexReference(const MachineFunction &MF, int FI) const override
getNonLocalFrameIndexReference - This method returns the offset used to reference a frame index locat...
TargetStackID::Value getStackIDForScalableVectors() const override
Returns the StackID that scalable vectors should be associated with.
bool hasFP(const MachineFunction &MF) const override
hasFP - Return true if the specified function should have a dedicated frame pointer register.
void emitPrologue(MachineFunction &MF, MachineBasicBlock &MBB) const override
emitProlog/emitEpilog - These methods insert prolog and epilog code into the function.
bool enableCFIFixup(MachineFunction &MF) const override
Returns true if we may need to fix the unwind information for the function.
void resetCFIToInitialState(MachineBasicBlock &MBB) const override
Emit CFI instructions that recreate the state of the unwind information upon fucntion entry.
bool hasReservedCallFrame(const MachineFunction &MF) const override
hasReservedCallFrame - Under normal circumstances, when a frame pointer is not required,...
bool canUseRedZone(const MachineFunction &MF) const
Can this function use the red zone for local allocations.
void processFunctionBeforeFrameFinalized(MachineFunction &MF, RegScavenger *RS) const override
processFunctionBeforeFrameFinalized - This method is called immediately before the specified function...
int getSEHFrameIndexOffset(const MachineFunction &MF, int FI) const
unsigned getWinEHFuncletFrameSize(const MachineFunction &MF) const
Funclets only need to account for space for the callee saved registers, as the locals are accounted f...
void orderFrameObjects(const MachineFunction &MF, SmallVectorImpl< int > &ObjectsToAllocate) const override
Order the symbols in the local stack frame.
void emitEpilogue(MachineFunction &MF, MachineBasicBlock &MBB) const override
void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS) const override
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
StackOffset getFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg) const override
getFrameIndexReference - Provide a base+offset reference to an FI slot for debug info.
StackOffset resolveFrameOffsetReference(const MachineFunction &MF, int64_t ObjectOffset, bool isFixed, bool isSVE, Register &FrameReg, bool PreferFP, bool ForSimm) const
bool assignCalleeSavedSpillSlots(MachineFunction &MF, const TargetRegisterInfo *TRI, std::vector< CalleeSavedInfo > &CSI, unsigned &MinCSFrameIndex, unsigned &MaxCSFrameIndex) const override
assignCalleeSavedSpillSlots - Allows target to override spill slot assignment logic.
StackOffset getFrameIndexReferencePreferSP(const MachineFunction &MF, int FI, Register &FrameReg, bool IgnoreSPUpdates) const override
For Win64 AArch64 EH, the offset to the Unwind object is from the SP before the update.
StackOffset resolveFrameIndexReference(const MachineFunction &MF, int FI, Register &FrameReg, bool PreferFP, bool ForSimm) const
unsigned getWinEHParentFrameOffset(const MachineFunction &MF) const override
The parent frame offset (aka dispFrame) is only used on X86_64 to retrieve the parent's frame pointer...
AArch64FunctionInfo - This class is derived from MachineFunctionInfo and contains private AArch64-spe...
bool needsShadowCallStackPrologueEpilogue(MachineFunction &MF) const
unsigned getCalleeSavedStackSize(const MachineFrameInfo &MFI) const
void setCalleeSaveBaseToFrameRecordOffset(int Offset)
bool shouldSignReturnAddress(const MachineFunction &MF) const
std::optional< int > getTaggedBasePointerIndex() const
bool needsDwarfUnwindInfo(const MachineFunction &MF) const
void setTaggedBasePointerOffset(unsigned Offset)
bool needsAsyncDwarfUnwindInfo(const MachineFunction &MF) const
void setMinMaxSVECSFrameIndex(int Min, int Max)
static bool isTailCallReturnInst(const MachineInstr &MI)
Returns true if MI is one of the TCRETURN* instructions.
static bool isSEHInstruction(const MachineInstr &MI)
Return true if the instructions is a SEH instruciton used for unwinding on Windows.
bool isReservedReg(const MachineFunction &MF, MCRegister Reg) const
bool hasBasePointer(const MachineFunction &MF) const
bool cannotEliminateFrame(const MachineFunction &MF) const
const AArch64RegisterInfo * getRegisterInfo() const override
const AArch64InstrInfo * getInstrInfo() const override
const AArch64TargetLowering * getTargetLowering() const override
const Triple & getTargetTriple() const
bool isCallingConvWin64(CallingConv::ID CC) const
const char * getChkStkName() const
bool swiftAsyncContextIsDynamicallySet() const
Return whether FrameLowering should always set the "extended frame present" bit in FP,...
bool hasInlineStackProbe(const MachineFunction &MF) const override
True if stack clash protection is enabled for this functions.
unsigned getRedZoneSize(const Function &F) const
bool supportSwiftError() const override
Return true if the target supports swifterror attribute.
ArrayRef - Represent a constant reference to an array (0 or more elements consecutively in memory),...
Definition: ArrayRef.h:41
size_t size() const
size - Get the array size.
Definition: ArrayRef.h:165
bool empty() const
empty - Check if the array is empty.
Definition: ArrayRef.h:160
bool hasAttrSomewhere(Attribute::AttrKind Kind, unsigned *Index=nullptr) const
Return true if the specified attribute is set for at least one parameter or for the return value.
bool test(unsigned Idx) const
Definition: BitVector.h:461
BitVector & reset()
Definition: BitVector.h:392
size_type count() const
count - Returns the number of bits which are set.
Definition: BitVector.h:162
BitVector & set()
Definition: BitVector.h:351
iterator_range< const_set_bits_iterator > set_bits() const
Definition: BitVector.h:140
A debug info location.
Definition: DebugLoc.h:33
bool hasOptSize() const
Optimize this function for size (-Os) or minimum size (-Oz).
Definition: Function.h:680
bool hasMinSize() const
Optimize this function for minimum size (-Oz).
Definition: Function.h:677
CallingConv::ID getCallingConv() const
getCallingConv()/setCallingConv(CC) - These method get and set the calling convention of this functio...
Definition: Function.h:262
AttributeList getAttributes() const
Return the attribute list for this Function.
Definition: Function.h:338
bool hasFnAttribute(Attribute::AttrKind Kind) const
Return true if the function has the attribute.
Definition: Function.cpp:677
void copyPhysReg(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, const DebugLoc &DL, MCRegister DestReg, MCRegister SrcReg, bool KillSrc) const override
Emit instructions to copy a pair of physical registers.
A set of physical registers with utility functions to track liveness when walking backward/forward th...
Definition: LivePhysRegs.h:50
bool available(const MachineRegisterInfo &MRI, MCPhysReg Reg) const
Returns true if register Reg and no aliasing register is in the set.
void stepBackward(const MachineInstr &MI)
Simulates liveness when stepping backwards over an instruction(bundle).
void removeReg(MCPhysReg Reg)
Removes a physical register, all its sub-registers, and all its super-registers from the set.
Definition: LivePhysRegs.h:90
void addLiveIns(const MachineBasicBlock &MBB)
Adds all live-in registers of basic block MBB.
void addLiveOuts(const MachineBasicBlock &MBB)
Adds all live-out registers of basic block MBB.
void addReg(MCPhysReg Reg)
Adds a physical register and all its sub-registers to the set.
Definition: LivePhysRegs.h:81
bool usesWindowsCFI() const
Definition: MCAsmInfo.h:799
static MCCFIInstruction createDefCfaRegister(MCSymbol *L, unsigned Register, SMLoc Loc={})
.cfi_def_cfa_register modifies a rule for computing CFA.
Definition: MCDwarf.h:548
static MCCFIInstruction createOffset(MCSymbol *L, unsigned Register, int Offset, SMLoc Loc={})
.cfi_offset Previous value of Register is saved at offset Offset from CFA.
Definition: MCDwarf.h:583
static MCCFIInstruction cfiDefCfaOffset(MCSymbol *L, int Offset, SMLoc Loc={})
.cfi_def_cfa_offset modifies a rule for computing CFA.
Definition: MCDwarf.h:556
static MCCFIInstruction createRestore(MCSymbol *L, unsigned Register, SMLoc Loc={})
.cfi_restore says that the rule for Register is now the same as it was at the beginning of the functi...
Definition: MCDwarf.h:616
static MCCFIInstruction createNegateRAState(MCSymbol *L, SMLoc Loc={})
.cfi_negate_ra_state AArch64 negate RA state.
Definition: MCDwarf.h:609
static MCCFIInstruction cfiDefCfa(MCSymbol *L, unsigned Register, int Offset, SMLoc Loc={})
.cfi_def_cfa defines a rule for computing CFA as: take address from Register and add Offset to it.
Definition: MCDwarf.h:541
static MCCFIInstruction createEscape(MCSymbol *L, StringRef Vals, SMLoc Loc={}, StringRef Comment="")
.cfi_escape Allows the user to add arbitrary bytes to the unwind info.
Definition: MCDwarf.h:647
static MCCFIInstruction createSameValue(MCSymbol *L, unsigned Register, SMLoc Loc={})
.cfi_same_value Current value of Register is the same as in the previous frame.
Definition: MCDwarf.h:630
MCSymbol * createTempSymbol()
Create a temporary symbol with a unique name.
Definition: MCContext.cpp:321
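These MCCFIInstruction factories feed MachineFunction::addFrameInst; the returned index is then attached to a CFI_INSTRUCTION pseudo. A minimal sketch of the idiom (DL and TII are assumed to come from the caller):

  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/CodeGen/MachineInstrBuilder.h"
  #include "llvm/CodeGen/TargetInstrInfo.h"
  #include "llvm/MC/MCDwarf.h"

  // Sketch: record "CFA = SP + Offset" at MBBI and tag it as frame setup.
  static void emitDefCfaOffset(llvm::MachineBasicBlock &MBB,
                               llvm::MachineBasicBlock::iterator MBBI,
                               const llvm::TargetInstrInfo &TII,
                               const llvm::DebugLoc &DL, int Offset) {
    llvm::MachineFunction &MF = *MBB.getParent();
    unsigned CFIIndex = MF.addFrameInst(
        llvm::MCCFIInstruction::cfiDefCfaOffset(/*L=*/nullptr, Offset));
    llvm::BuildMI(MBB, MBBI, DL, TII.get(llvm::TargetOpcode::CFI_INSTRUCTION))
        .addCFIIndex(CFIIndex)
        .setMIFlag(llvm::MachineInstr::FrameSetup);
  }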
Describe properties that are true of each instruction in the target description file.
Definition: MCInstrDesc.h:198
Wrapper class representing physical registers. Should be passed by value.
Definition: MCRegister.h:33
MCSymbol - Instances of this class represent a symbol name in the MC file, and MCSymbols are created ...
Definition: MCSymbol.h:40
void transferSuccessorsAndUpdatePHIs(MachineBasicBlock *FromMBB)
Transfers all the successors, as in transferSuccessors, and updates PHI operands in the successor bloc...
instr_iterator instr_begin()
const BasicBlock * getBasicBlock() const
Return the LLVM basic block that this instance corresponded to originally.
bool isLiveIn(MCPhysReg Reg, LaneBitmask LaneMask=LaneBitmask::getAll()) const
Return true if the specified register is in the live in set.
bool isEHFuncletEntry() const
Returns true if this is the entry block of an EH funclet.
iterator getFirstTerminator()
Returns an iterator to the first terminator instruction of this basic block.
MachineInstr & instr_back()
void addSuccessor(MachineBasicBlock *Succ, BranchProbability Prob=BranchProbability::getUnknown())
Add Succ as a successor of this MachineBasicBlock.
DebugLoc findDebugLoc(instr_iterator MBBI)
Find the next valid DebugLoc starting at MBBI, skipping any debug instructions.
iterator getLastNonDebugInstr(bool SkipPseudoOp=true)
Returns an iterator to the last non-debug instruction in the basic block, or end().
instr_iterator instr_end()
void addLiveIn(MCRegister PhysReg, LaneBitmask LaneMask=LaneBitmask::getAll())
Adds the specified register as a live in.
const MachineFunction * getParent() const
Return the MachineFunction containing this basic block.
instr_iterator erase(instr_iterator I)
Remove an instruction from the instruction list and delete it.
reverse_iterator rbegin()
iterator insertAfter(iterator I, MachineInstr *MI)
Insert MI into the instruction list after I.
void splice(iterator Where, MachineBasicBlock *Other, iterator From)
Take an instruction from MBB 'Other' at the position From, and insert it into this MBB right before '...
The MachineFrameInfo class represents an abstract stack frame until prolog/epilog code is inserted.
int CreateFixedObject(uint64_t Size, int64_t SPOffset, bool IsImmutable, bool isAliased=false)
Create a new object at a fixed location on the stack.
bool hasVarSizedObjects() const
This method may be called any time after instruction selection is complete to determine if the stack ...
uint64_t getStackSize() const
Return the number of bytes that must be allocated to hold all of the fixed size frame objects.
int CreateStackObject(uint64_t Size, Align Alignment, bool isSpillSlot, const AllocaInst *Alloca=nullptr, uint8_t ID=0)
Create a new statically sized stack object, returning a nonnegative identifier to represent it.
bool hasCalls() const
Return true if the current function has any function calls.
bool isFrameAddressTaken() const
This method may be called any time after instruction selection is complete to determine if there is a...
Align getMaxAlign() const
Return the alignment in bytes that this function must be aligned to, which is greater than the defaul...
void setObjectOffset(int ObjectIdx, int64_t SPOffset)
Set the stack frame offset of the specified object.
bool hasPatchPoint() const
This method may be called any time after instruction selection is complete to determine if there is a...
int getStackProtectorIndex() const
Return the index for the stack protector object.
uint64_t estimateStackSize(const MachineFunction &MF) const
Estimate and return the size of the stack frame.
void setStackID(int ObjectIdx, uint8_t ID)
bool isCalleeSavedInfoValid() const
Has the callee saved info been calculated yet?
Align getObjectAlign(int ObjectIdx) const
Return the alignment of the specified stack object.
int64_t getObjectSize(int ObjectIdx) const
Return the size of the specified object.
bool isMaxCallFrameSizeComputed() const
bool hasStackMap() const
This method may be called any time after instruction selection is complete to determine if there is a...
const std::vector< CalleeSavedInfo > & getCalleeSavedInfo() const
Returns a reference to the callee-saved info vector for the current function.
unsigned getMaxCallFrameSize() const
Return the maximum size of a call frame that must be allocated for an outgoing function call.
int getObjectIndexEnd() const
Return one past the maximum frame object index.
bool hasStackProtectorIndex() const
uint8_t getStackID(int ObjectIdx) const
int64_t getObjectOffset(int ObjectIdx) const
Return the assigned stack offset of the specified object from the incoming stack pointer.
int getObjectIndexBegin() const
Return the minimum frame object index.
bool isDeadObjectIndex(int ObjectIdx) const
Returns true if the specified index corresponds to a dead object.
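A short sketch of the MachineFrameInfo calls above, creating a spill slot and querying it back (the size and alignment are illustrative):

  #include <cassert>
  #include "llvm/CodeGen/MachineFrameInfo.h"
  #include "llvm/CodeGen/MachineFunction.h"

  // Sketch: allocate an 8-byte, 8-aligned spill slot and inspect it.
  static int makeSpillSlot(llvm::MachineFunction &MF) {
    llvm::MachineFrameInfo &MFI = MF.getFrameInfo();
    int FI = MFI.CreateStackObject(/*Size=*/8, llvm::Align(8),
                                   /*isSpillSlot=*/true);
    assert(MFI.getObjectSize(FI) == 8);
    assert(MFI.getObjectAlign(FI) == llvm::Align(8));
    // With variable-sized objects present, SP-relative offsets are not
    // compile-time constants, so a frame or base pointer is needed.
    bool NeedsFPOrBP = MFI.hasVarSizedObjects();
    (void)NeedsFPOrBP;
    return FI;
  }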
const WinEHFuncInfo * getWinEHFuncInfo() const
getWinEHFuncInfo - Return information about how the current function uses Windows exception handling.
unsigned addFrameInst(const MCCFIInstruction &Inst)
const TargetSubtargetInfo & getSubtarget() const
getSubtarget - Return the subtarget for which this machine code is being compiled.
MachineMemOperand * getMachineMemOperand(MachinePointerInfo PtrInfo, MachineMemOperand::Flags f, LLT MemTy, Align base_alignment, const AAMDNodes &AAInfo=AAMDNodes(), const MDNode *Ranges=nullptr, SyncScope::ID SSID=SyncScope::System, AtomicOrdering Ordering=AtomicOrdering::NotAtomic, AtomicOrdering FailureOrdering=AtomicOrdering::NotAtomic)
getMachineMemOperand - Allocate a new MachineMemOperand.
MachineFrameInfo & getFrameInfo()
getFrameInfo - Return the frame info object for the current function.
MachineRegisterInfo & getRegInfo()
getRegInfo - Return information about the registers currently in use.
Function & getFunction()
Return the LLVM function that this machine code represents.
const LLVMTargetMachine & getTarget() const
getTarget - Return the target machine that this machine code is compiled with.
MachineModuleInfo & getMMI() const
Ty * getInfo()
getInfo - Keep track of various per-function pieces of information for backends that would like to do...
const MachineBasicBlock & front() const
MachineBasicBlock * CreateMachineBasicBlock(const BasicBlock *BB=nullptr, std::optional< UniqueBBID > BBID=std::nullopt)
CreateMachineBasicBlock - Allocate a new MachineBasicBlock.
void insert(iterator MBBI, MachineBasicBlock *MBB)
const MachineInstrBuilder & setMemRefs(ArrayRef< MachineMemOperand * > MMOs) const
const MachineInstrBuilder & addExternalSymbol(const char *FnName, unsigned TargetFlags=0) const
const MachineInstrBuilder & addCFIIndex(unsigned CFIIndex) const
const MachineInstrBuilder & setMIFlag(MachineInstr::MIFlag Flag) const
const MachineInstrBuilder & addImm(int64_t Val) const
Add a new immediate operand.
const MachineInstrBuilder & add(const MachineOperand &MO) const
const MachineInstrBuilder & addFrameIndex(int Idx) const
const MachineInstrBuilder & addReg(Register RegNo, unsigned flags=0, unsigned SubReg=0) const
Add a new virtual register operand.
const MachineInstrBuilder & addMBB(MachineBasicBlock *MBB, unsigned TargetFlags=0) const
const MachineInstrBuilder & addUse(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register use operand.
const MachineInstrBuilder & setMIFlags(unsigned Flags) const
const MachineInstrBuilder & addMemOperand(MachineMemOperand *MMO) const
const MachineInstrBuilder & addDef(Register RegNo, unsigned Flags=0, unsigned SubReg=0) const
Add a virtual register definition operand.
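The methods above chain off BuildMI (see its entry further below) to construct an instruction fluently. A sketch, assuming the AArch64 backend's internal AArch64InstrInfo.h for the opcode and register names, with "using namespace llvm" in effect as is conventional in target code:

  #include "AArch64InstrInfo.h" // target-internal header (assumption)
  #include "llvm/CodeGen/MachineInstrBuilder.h"

  // Sketch: emit "sub sp, sp, #16" and mark it as part of the prologue.
  static void allocateSixteenBytes(MachineBasicBlock &MBB,
                                   MachineBasicBlock::iterator MBBI,
                                   const TargetInstrInfo &TII,
                                   const DebugLoc &DL) {
    BuildMI(MBB, MBBI, DL, TII.get(AArch64::SUBXri), AArch64::SP)
        .addReg(AArch64::SP)
        .addImm(16)
        .addImm(0) // shifter immediate: LSL #0
        .setMIFlag(MachineInstr::FrameSetup);
  }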
Representation of each machine instruction.
Definition: MachineInstr.h:69
void setFlags(unsigned flags)
Definition: MachineInstr.h:392
void eraseFromParent()
Unlink 'this' from the containing basic block and delete it.
uint32_t getFlags() const
Return the MI flags bitvector.
Definition: MachineInstr.h:374
@ MOLoad
The memory access reads data.
@ MOStore
The memory access writes data.
This class contains meta information specific to a module.
const MCContext & getContext() const
MachineOperand class - Representation of each machine instruction operand.
void setImm(int64_t immVal)
int64_t getImm() const
static MachineOperand CreateImm(int64_t Val)
bool isFI() const
isFI - Tests if this is a MO_FrameIndex operand.
MachineRegisterInfo - Keep track of information for virtual and physical registers,...
Register createVirtualRegister(const TargetRegisterClass *RegClass, StringRef Name="")
createVirtualRegister - Create and return a new virtual register in the function with the specified r...
bool isLiveIn(Register Reg) const
const MCPhysReg * getCalleeSavedRegs() const
Returns list of callee saved registers.
bool isPhysRegUsed(MCRegister PhysReg, bool SkipRegMaskTest=false) const
Return true if the specified register is modified or read in this function.
MutableArrayRef - Represent a mutable reference to an array (0 or more elements consecutively in memo...
Definition: ArrayRef.h:307
void enterBasicBlockEnd(MachineBasicBlock &MBB)
Start tracking liveness from the end of basic block MBB.
Register FindUnusedReg(const TargetRegisterClass *RC) const
Find an unused register of the specified register class.
void backward()
Update internal register state and move MBB iterator backwards.
void addScavengingFrameIndex(int FI)
Add a scavenging frame index.
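A sketch of the scavenging entries above: start tracking at the end of a (non-empty) block, step back one instruction, and ask for a free register of class RC (RC might be, say, &AArch64::GPR64RegClass):

  #include "llvm/CodeGen/RegisterScavenging.h"

  // Sketch: returns 0 (no register) if nothing in RC is free at this point.
  static llvm::Register findScratch(llvm::MachineBasicBlock &MBB,
                                    const llvm::TargetRegisterClass *RC) {
    llvm::RegScavenger RS;
    RS.enterBasicBlockEnd(MBB); // begin liveness tracking at the block's end
    RS.backward();              // step backwards over the final instruction
    return RS.FindUnusedReg(RC);
  }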
Wrapper class representing virtual and physical registers.
Definition: Register.h:19
bool empty() const
Definition: SmallVector.h:94
size_t size() const
Definition: SmallVector.h:91
This class consists of common code factored out of the SmallVector class to reduce code duplication b...
Definition: SmallVector.h:586
reference emplace_back(ArgTypes &&... Args)
Definition: SmallVector.h:950
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
Definition: SmallVector.h:696
void push_back(const T &Elt)
Definition: SmallVector.h:426
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
Definition: SmallVector.h:1209
StackOffset holds a fixed and a scalable offset in bytes.
Definition: TypeSize.h:33
int64_t getFixed() const
Returns the fixed component of the stack offset.
Definition: TypeSize.h:49
int64_t getScalable() const
Returns the scalable component of the stack offset.
Definition: TypeSize.h:52
static StackOffset get(int64_t Fixed, int64_t Scalable)
Definition: TypeSize.h:44
static StackOffset getScalable(int64_t Scalable)
Definition: TypeSize.h:43
static StackOffset getFixed(int64_t Fixed)
Definition: TypeSize.h:42
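A worked sketch of StackOffset arithmetic: the fixed component is plain bytes, while the scalable component is multiplied by the hardware's runtime vscale (as for SVE stack objects):

  #include "llvm/Support/TypeSize.h"

  void stackOffsetDemo() {
    // 16 fixed bytes plus 32 scalable bytes.
    llvm::StackOffset Off =
        llvm::StackOffset::get(/*Fixed=*/16, /*Scalable=*/32);
    Off += llvm::StackOffset::getFixed(8); // fixed part becomes 24
    // Off.getFixed() == 24, Off.getScalable() == 32
  }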
StringRef - Represent a constant reference to a string, i.e.
Definition: StringRef.h:50
virtual void determineCalleeSaves(MachineFunction &MF, BitVector &SavedRegs, RegScavenger *RS=nullptr) const
This method determines which of the registers reported by TargetRegisterInfo::getCalleeSavedRegs() sh...
int getOffsetOfLocalArea() const
getOffsetOfLocalArea - This method returns the offset of the local area from the stack pointer on ent...
Align getStackAlign() const
getStackAlignment - This method returns the number of bytes to which the stack pointer must be aligne...
StackDirection getStackGrowthDirection() const
getStackGrowthDirection - Return the direction the stack grows.
virtual bool enableCFIFixup(MachineFunction &MF) const
Returns true if we may need to fix the unwind information for the function.
TargetInstrInfo - Interface to description of machine instruction set.
TargetOptions Options
CodeModel::Model getCodeModel() const
Returns the code model.
const MCAsmInfo * getMCAsmInfo() const
Return target specific asm information.
SwiftAsyncFramePointerMode SwiftAsyncFramePointer
Control when and how the Swift async frame pointer bit should be set.
bool DisableFramePointerElim(const MachineFunction &MF) const
DisableFramePointerElim - This returns true if frame pointer elimination optimization should be disab...
TargetRegisterInfo base class - We assume that the target defines a static array of TargetRegisterDes...
const TargetRegisterClass * getMinimalPhysRegClass(MCRegister Reg, MVT VT=MVT::Other) const
Returns the Register Class of a physical register of the given type, picking the most sub register cl...
Align getSpillAlign(const TargetRegisterClass &RC) const
Return the minimum required alignment in bytes for a spill slot for a register of this class.
bool hasStackRealignment(const MachineFunction &MF) const
True if stack realignment is required and still possible.
unsigned getSpillSize(const TargetRegisterClass &RC) const
Return the size in bytes of the stack slot allocated to hold a spilled copy of a register from class ...
TargetSubtargetInfo - Generic base class for all target subtargets.
virtual const TargetRegisterInfo * getRegisterInfo() const
getRegisterInfo - If register information is available, return it.
virtual const TargetInstrInfo * getInstrInfo() const
StringRef getArchName() const
Get the architecture (first) component of the triple.
Definition: Triple.cpp:1189
static constexpr TypeSize getFixed(ScalarTy ExactSize)
Definition: TypeSize.h:330
The instances of the Type class are immutable: once they are created, they are never changed.
Definition: Type.h:45
self_iterator getIterator()
Definition: ilist_node.h:109
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
@ MO_GOT
MO_GOT - This flag indicates that a symbol operand represents the address of the GOT entry for the sy...
static unsigned getShiftValue(unsigned Imm)
getShiftValue - Extract the shift value.
static unsigned getArithExtendImm(AArch64_AM::ShiftExtendType ET, unsigned Imm)
getArithExtendImm - Encode the extend type and shift amount for an arithmetic instruction: imm: 3-bit...
static uint64_t encodeLogicalImmediate(uint64_t imm, unsigned regSize)
encodeLogicalImmediate - Return the encoded immediate value for a logical immediate instruction of th...
static unsigned getShifterImm(AArch64_AM::ShiftExtendType ST, unsigned Imm)
getShifterImm - Encode the shift type and amount: imm: 6-bit shift amount shifter: 000 ==> lsl 001 ==...
const unsigned StackProbeMaxLoopUnroll
Maximum number of iterations to unroll for a constant size probing loop.
const unsigned StackProbeMaxUnprobedStack
Maximum allowed number of unprobed bytes above SP at an ABI boundary.
@ PreserveMost
Used for runtime calls that preserve most registers.
Definition: CallingConv.h:63
@ CXX_FAST_TLS
Used for access functions.
Definition: CallingConv.h:72
@ GHC
Used by the Glasgow Haskell Compiler (GHC).
Definition: CallingConv.h:50
@ PreserveAll
Used for runtime calls that preserve (almost) all registers.
Definition: CallingConv.h:66
@ Win64
The C convention as implemented on Windows/x86-64 and AArch64.
Definition: CallingConv.h:159
@ SwiftTail
This follows the Swift calling convention in how arguments are passed but guarantees tail calls will ...
Definition: CallingConv.h:87
@ Implicit
Not emitted register (e.g. carry, or temporary result).
@ Dead
Unused definition.
@ Define
Register definition.
@ Kill
The last use of a register.
Reg
All possible values of the reg field in the ModR/M byte.
initializer< Ty > init(const Ty &Val)
Definition: CommandLine.h:450
NodeAddr< InstrNode * > Instr
Definition: RDFGraph.h:389
This is an optimization pass for GlobalISel generic memory operations.
Definition: AddressRanges.h:18
@ Offset
Definition: DWP.cpp:456
void stable_sort(R &&Range)
Definition: STLExtras.h:2004
MCCFIInstruction createDefCFA(const TargetRegisterInfo &TRI, unsigned FrameReg, unsigned Reg, const StackOffset &Offset, bool LastAdjustmentWasScalable=true)
MachineInstrBuilder BuildMI(MachineFunction &MF, const MIMetadata &MIMD, const MCInstrDesc &MCID)
Builder interface. Specify how to create the initial instruction itself.
int isAArch64FrameOffsetLegal(const MachineInstr &MI, StackOffset &Offset, bool *OutUseUnscaledOp=nullptr, unsigned *OutUnscaledOp=nullptr, int64_t *EmittableOffset=nullptr)
Check if the Offset is a valid frame offset for MI.
detail::scope_exit< std::decay_t< Callable > > make_scope_exit(Callable &&F)
Definition: ScopeExit.h:59
MCCFIInstruction createCFAOffset(const TargetRegisterInfo &MRI, unsigned Reg, const StackOffset &OffsetFromDefCFA)
iterator_range< T > make_range(T x, T y)
Convenience function for iterating over sub-ranges.
unsigned getBLRCallOpcode(const MachineFunction &MF)
Return opcode to be used for indirect calls.
@ AArch64FrameOffsetCannotUpdate
Offset cannot apply.
auto reverse(ContainerTy &&C)
Definition: STLExtras.h:428
@ Always
Always set the bit.
@ DeploymentBased
Determine whether to set the bit statically or dynamically based on the deployment target.
raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition: Debug.cpp:163
void emitFrameOffset(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI, const DebugLoc &DL, unsigned DestReg, unsigned SrcReg, StackOffset Offset, const TargetInstrInfo *TII, MachineInstr::MIFlag=MachineInstr::NoFlags, bool SetNZCV=false, bool NeedsWinCFI=false, bool *HasWinCFI=nullptr, bool EmitCFAOffset=false, StackOffset InitialOffset={}, unsigned FrameReg=AArch64::SP)
emitFrameOffset - Emit instructions as needed to set DestReg to SrcReg plus Offset.
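A sketch of a typical call, deallocating 32 bytes in an epilogue. MBB, MBBI, DL, and TII (a const TargetInstrInfo *) are assumed from the surrounding pass; the helper splits large or scalable offsets into legal AArch64 instruction sequences:

  // Positive fixed offsets move SP towards higher addresses,
  // so this frees 32 bytes of frame.
  emitFrameOffset(MBB, MBBI, DL, AArch64::SP, AArch64::SP,
                  StackOffset::getFixed(32), TII,
                  MachineInstr::FrameDestroy);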
void report_fatal_error(Error Err, bool gen_crash_diag=true)
Report a serious error, calling any installed error handler.
Definition: Error.cpp:156
EHPersonality classifyEHPersonality(const Value *Pers)
See if the given exception handling personality function is one that we understand.
unsigned getDefRegState(bool B)
unsigned getKillRegState(bool B)
uint64_t alignTo(uint64_t Size, Align A)
Returns the smallest multiple of A that is no smaller than Size.
Definition: Alignment.h:155
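For example, rounding a 40-byte callee-save area up to the 16-byte stack alignment:

  uint64_t Padded = llvm::alignTo(/*Size=*/40, llvm::Align(16)); // == 48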
bool isAsynchronousEHPersonality(EHPersonality Pers)
Returns true if this personality function catches asynchronous exceptions.
Printable printReg(Register Reg, const TargetRegisterInfo *TRI=nullptr, unsigned SubIdx=0, const MachineRegisterInfo *MRI=nullptr)
Prints virtual and physical registers with or without a TRI instance.
static bool recomputeLiveIns(MachineBasicBlock &MBB)
Convenience function for recomputing live-ins for an MBB.
Definition: LivePhysRegs.h:198
void swap(llvm::BitVector &LHS, llvm::BitVector &RHS)
Implement std::swap in terms of BitVector swap.
Definition: BitVector.h:860
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition: Alignment.h:39
uint64_t value() const
This is a hole in the type system and should not be abused.
Definition: Alignment.h:85
Description of the encoding of one expression Op.
static MachinePointerInfo getFixedStack(MachineFunction &MF, int FI, int64_t Offset=0)
Return a MachinePointerInfo record that refers to the specified FrameIndex.