Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__builtin_ia32_readeflags_u64() unnecessarily forces a frame pointer #46875

Closed
llvmbot opened this issue Sep 14, 2020 · 15 comments
Closed

__builtin_ia32_readeflags_u64() unnecessarily forces a frame pointer #46875

llvmbot opened this issue Sep 14, 2020 · 15 comments
Assignees
Labels

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 14, 2020

Bugzilla Link 47531
Version trunk
OS Linux
Blocks #4440
Reporter LLVM Bugzilla Contributor
CC @topperc,@majnemer,@RKSimon,@nickdesaulniers,@zygoloid,@rnk,@rotateright,@tstellar

Extended Description

This code:

unsigned long read_eflags(void)
{
   return __builtin_ia32_readeflags_u64();
}

compiles to:

read_eflags:
        pushq   %rbp
        movq    %rsp, %rbp
        pushfq
        popq    %rax
        popq    %rbp
        retq

which has an unnecessary frame pointer. (I'm somewhat puzzled as to why this happened. Sure, the redzone slot clobbered by the pushfq can't be used, but I don't see what this has to do with a frame pointer.)

gcc generates:

read_eflags:
        pushfq
        popq    %rax
        ret
@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 14, 2020

assigned to @nickdesaulniers

@nickdesaulniers
Copy link
Member

Looks like:
llvm-project/llvm/lib/Target/X86/X86ISelLowering.cpp:25858

case llvm::Intrinsic::x86_flags_read_u32:
case llvm::Intrinsic::x86_flags_read_u64:
case llvm::Intrinsic::x86_flags_write_u32:
case llvm::Intrinsic::x86_flags_write_u64: {
  // We need a frame pointer because this will get lowered to a PUSH/POP
  // sequence.
  MachineFrameInfo &MFI = DAG.getMachineFunction().getFrameInfo();
  MFI.setHasCopyImplyingStackAdjustment(true);
  // Don't do anything here, we will expand these intrinsics out later
  // during FinalizeISel in EmitInstrWithCustomInserter.
  return Op;
}

added by 861a0ae. Commenting out that MFI operation fixes the test case in question. Doing so will cause test failures in:

LLVM :: CodeGen/X86/win64_frame.ll
LLVM :: CodeGen/X86/x86-64-flags-intrinsics.ll
LLVM :: CodeGen/X86/x86-flags-intrinsics.ll

I don't exactly follow the comment in the source. Why does push/pop require a frame pointer? Also, the tests seem to all use windows target triple; is there some requirement of the Windows ABI there?

@topperc
Copy link
Collaborator

topperc commented Sep 18, 2020

Here are some older comments from before some other EFLAGS copying code was removed in 19618fc

-/// This function checks if any of the users of EFLAGS copies the EFLAGS. We
-/// know that the code that lowers COPY of EFLAGS has to use the stack, and if
-/// we don't adjust the stack we clobber the first frame index.
-/// See X86InstrInfo::copyPhysReg.
-static bool hasCopyImplyingStackAdjustment(const MachineFunction &MF) {
-  const MachineRegisterInfo &MRI = MF.getRegInfo();
-  return any_of(MRI.reg_instructions(X86::EFLAGS),
-                [](const MachineInstr &RI) { return RI.isCopy(); });
-}
-
-void X86TargetLowering::finalizeLowering(MachineFunction &MF) const {
-  if (hasCopyImplyingStackAdjustment(MF)) {
-    MachineFrameInfo &MFI = MF.getFrameInfo();
-    MFI.setHasCopyImplyingStackAdjustment(true);
-  }
-
-  TargetLoweringBase::finalizeLowering(MF);
-}
-

@rnk
Copy link
Collaborator

rnk commented Sep 21, 2020

At the time, Jan 2016, David was working on Windows EH stuff, and I think the main concern was that if you have PUSHF or POPF instructions in a function, the stack unwind info will be incorrect after that instruction. This is true regardless of whether DWARF or Windows unwind info is in use. We had just finished going to great lengths to making the Windows unwind info accurate at every instruction boundary, and this may have represented a bug to David. Writing flags is always super slow, so we weren't really concerned about introducing extra overhead for it. I can see the argument that users may not care about perfect unwind info and may prefer to read flags more quickly.

@nickdesaulniers
Copy link
Member

@nickdesaulniers
Copy link
Member

Noting: see also: #46874

@llvmbot
Copy link
Collaborator Author

llvmbot commented Dec 6, 2020

https://reviews.llvm.org/D92695

I don't have a Phabricator account, but maybe a comment here will still be useful.

As you all discovered, on x86_64, a kernel essentially can't use a red-zone. [0] That being said, I would hope that LLVM would be sensible enough to generate correct code regardless of command line options. For example, the eflags builtins could disable the redzone in the function in which they're used, or they could be even more clever and reserve the top redzone slot for themselves so that the rest of the redzone could be used.

But I bet that they will be much more common in kernel code than user code. Kernel code can and does manipulate various interesting EFLAGS bits that can't or mostly wouldn't be written from user mode, in particular IF and AC. The only reason I can think of for using this builtins from 64-bit user mode is to manipulate the EFLAGS.TF bit for single-stepping. The RF bit is also interesting but IIRC can't be written using POPF in any useful way, and the arithmetic flags bits don't make any sense to manipulate like this in C code. Authors of test cases that abuse the kernel or of DRM schemes might play with EFLAGS.TS, and maybe someone wants to fiddle with EFLAGS.AC, but those would be quite rare. (32-bit user mode intended to work on very old CPUs uses the EFLAGS.ID to determine whether the CPUID instruction exists. I doubt anyone cares about the performance of such code.)

[0] It's techincally possible but it's horrible for all kinds of reasons. A user program that cooperates with the kernel to get direct-to-usermode delivery of interrupts also can't use a red-zone, but no one except exploit authors would do this. There are hardware features^Wbugs that have historically been interesting attacks against hypervisors that involve programming the CPU like this, but no one does it for legitimate reasons.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@nickdesaulniers
Copy link
Member

see also #20571.

@llvmbot llvmbot added the confirmed Verified by a second party label Jan 26, 2022
@nickdesaulniers
Copy link
Member

cc @gwelymernans

@nickdesaulniers nickdesaulniers removed their assignment Feb 8, 2022
@nickdesaulniers
Copy link
Member

@nickdesaulniers
Copy link
Member

@tstellar can we backport f3481f4 into the 14.x release?

@tstellar
Copy link
Collaborator

/cherry-pick f3481f4

@llvmbot
Copy link
Collaborator Author

llvmbot commented Feb 11, 2022

/branch llvmbot/llvm-project/issue46875

llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Feb 11, 2022
This ensures that the Windows unwinder will work at every instruction
boundary, and allows other targets to read and write flags without
setting up a frame pointer.

Fixes llvmGH-46875

Differential Revision: https://reviews.llvm.org/D119391

(cherry picked from commit f3481f4)
@llvmbot
Copy link
Collaborator Author

llvmbot commented Feb 11, 2022

/pull-request llvmbot#42

tstellar pushed a commit to llvmbot/llvm-project that referenced this issue Feb 14, 2022
This ensures that the Windows unwinder will work at every instruction
boundary, and allows other targets to read and write flags without
setting up a frame pointer.

Fixes llvmGH-46875

Differential Revision: https://reviews.llvm.org/D119391

(cherry picked from commit f3481f4)
@tstellar
Copy link
Collaborator

Merged: 66c59c0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants