AArch64 debug problems on Windows #51015

llvmbot · 2021-08-30T13:44:39Z


Bugzilla Link	51673
Resolution	FIXED
Resolved on	Sep 22, 2021 04:14
Version	10.0
OS	Windows NT
Reporter	LLVM Bugzilla Contributor
CC	@JDevlieghere,@mstorsjo
Fixed by commit(s)	`9f34f75`

Extended Description

I was using lldb on Windows (Raspberry Pi 4 - ARM Cortex-A72 processor), and there were a couple of x86-isms that persist that make it impossible to set and use breakpoints.

Platform::GetSoftwareBreakpointTrapOpcode()
NativeProcessProtocol::GetSoftwareBreakpointTrapOpcode()
Uses "0xd4 0x20 0x00 0x00" for breakpoint on aarch64, but this does not work on Windows as it fails with a STATUS_ILLEGAL_INSTRUCTION exception being thrown.

case llvm::Triple::aarch64: {
static const uint8_t g_aarch64_opcode[] = {0x00, 0x00, 0x20, 0xd4};

The compiler intrinsic __debugbreak() generates "{0x00, 0x00, 0x3e, 0xd4}". If I use this instead, the program stops at the requested breakpoint, as expected.

TargetThreadWindows::DoResume()
NativeThreadWindows::DoResume()
Sets flag 0x100 for single step, but that's only valid for x86/x64.

if (resume_state == eStateStepping) {
uint32_t flags_index =
GetRegisterContext()->ConvertRegisterKindToRegisterNumber(
eRegisterKindGeneric, LLDB_REGNUM_GENERIC_FLAGS);
uint64_t flags_value =
GetRegisterContext()->ReadRegisterAsUnsigned(flags_index, 0);
flags_value |= 0x100; // Set the trap flag on the CPU /* only correct for x86/x64 */
GetRegisterContext()->WriteRegisterFromUnsigned(flags_index, flags_value);
}

For AArch64, it should instead be 0x200000, as this is the location of the 'SS' bit of the PState register. With this value, single stepping works as expected.

Without a fix in this area, single stepping simply doesn't work.
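The two masks differ because each architecture keeps its single-step control in a different bit of its status register: bit 8 (the trap flag) of EFLAGS on x86/x64, and bit 21 (SS) on AArch64, as described above. A minimal sketch of selecting the mask per architecture follows; the `Arch` enum and both helper names are hypothetical and are not LLDB APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical architecture tag; LLDB itself works with llvm::Triple.
enum class Arch { X86, X86_64, AArch64 };

// Returns the status-register bit that requests a single step.
constexpr uint64_t SingleStepMask(Arch arch) {
  switch (arch) {
  case Arch::X86:
  case Arch::X86_64:
    return uint64_t{1} << 8;  // EFLAGS trap flag, i.e. 0x100
  case Arch::AArch64:
    return uint64_t{1} << 21; // SS bit, i.e. 0x200000
  }
  return 0; // not reached for the enumerators above
}

// Sets the single-step bit in a flags value read from the register context.
constexpr uint64_t WithSingleStep(uint64_t flags_value, Arch arch) {
  return flags_value | SingleStepMask(arch);
}

static_assert(SingleStepMask(Arch::X86_64) == 0x100, "x86 trap flag");
static_assert(SingleStepMask(Arch::AArch64) == 0x200000, "AArch64 SS bit");
```

In DoResume(), the hard-coded `0x100` would then become a lookup keyed on the target architecture.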

ProcessWindows::RefreshStateAfterStop()
When handling EXCEPTION_BREAKPOINT, it assumes that the breakpoint instruction is 1 byte, which is only correct for x86/x64.

// The current EIP is AFTER the BP opcode, which is one byte.
uint64_t pc = register_context->GetPC() - 1;

The basic theory here is that after a breakpoint exception, the program counter points to the instruction after the breakpoint, so we need to back up by one breakpoint instruction to get to the real breakpoint address. On x86, a breakpoint is the single byte 0xCC; for AArch64 it is 4 bytes (as noted above), and other architectures will differ again. One can temporarily tweak the constant so that it works, but since the breakpoint opcode is already encapsulated in Platform.cpp, it would probably be best to simply leverage that.
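Put differently, the adjustment is "subtract the size of the trap opcode" rather than "subtract 1". A minimal sketch under that assumption follows; `BreakpointAddressFromPC` and the two size constants are hypothetical names, and the reported PC is assumed to point just past the trap instruction, as observed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Trap opcode sizes taken from the discussion: the single byte 0xCC on x86,
// one 4-byte instruction on AArch64.
constexpr std::size_t kX86TrapSize = 1;
constexpr std::size_t kAArch64TrapSize = 4;

// Hypothetical helper: given the PC reported after a breakpoint exception,
// return the address at which the trap opcode was written.
constexpr uint64_t BreakpointAddressFromPC(uint64_t pc, std::size_t trap_size) {
  return pc - trap_size;
}
```

The resulting address is what would be passed to the breakpoint site lookup instead of `GetPC() - 1`.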

Note that I am currently working in a LLVM 10.0 source tree, but I looked ahead to version 12 and I see no changes in this area.

I should add that with fixes/tweaks in all 3 of these places, the AArch64 lldb actually behaves pretty normally. I am still having a little trouble with PDB files, but I can address that separately.


mstorsjo · 2021-08-30T16:31:10Z

Thanks for analyzing these issues! I've noted similar things (and users have reported it to me at mstorsjo/llvm-mingw#198) but haven't had time to look into it properly yet.

Do you happen to have patches for these issues and/or are you planning on submitting them at http://reviews.llvm.org, or do you want me to take them from here?

llvmbot · 2021-08-30T17:28:52Z

The changes I have are pretty rough right now - enough to prove the nature of the problem, and also to demonstrate that lldb is actually pretty functional with those changes. But at the moment they aren't clean enough to submit.

If you could take it from here, I would be fine with that.

mstorsjo · 2021-08-30T19:30:29Z

> I was using lldb on Windows (Raspberry Pi 4 - ARM Cortex-A72 processor), and
> there were a couple of x86-isms that persist that make it impossible to set
> and use breakpoints.
>
> Platform::GetSoftwareBreakpointTrapOpcode()
> NativeProcessProtocol::GetSoftwareBreakpointTrapOpcode()
> Uses "0xd4 0x20 0x00 0x00" for breakpoint on aarch64, but this does not work
> on Windows as it fails with a STATUS_ILLEGAL_INSTRUCTION exception being
> thrown.
>
> case llvm::Triple::aarch64: {
> static const uint8_t g_aarch64_opcode[] = {0x00, 0x00, 0x20, 0xd4};
>
> The compiler intrinsic __debugbreak() generates "{0x00, 0x00, 0x3e, 0xd4}".
> If I instead use this, then the program stops at the requested breakpoint,
> as expected.

FWIW, I think that the exact opcode to use for debug break is platform dependent, and the existing fallback value is what's used on Linux. The current opcode corresponds to the instruction "brk #0", while the Windows specific value indeed is "brk #0xf000", which assembles to the bytes you're suggesting.

So I guess the right course of action here is to override GetSoftwareBreakpointTrapOpcode() in a windows specific subclass and handle aarch64 there.
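One way to picture that suggestion is a base class that returns the generic opcode and a Windows subclass that overrides it for aarch64. The class names and the `Arch` enum below are hypothetical stand-ins rather than LLDB's real classes, and the sketch returns a `std::vector` instead of `llvm::ArrayRef` so that it stays self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Arch { X86_64, AArch64 }; // hypothetical, stands in for llvm::Triple

class GenericProcess {
public:
  explicit GenericProcess(Arch arch) : m_arch(arch) {}
  virtual ~GenericProcess() = default;

  // Fallback opcodes, matching the values quoted in this thread.
  virtual std::vector<uint8_t> GetSoftwareBreakpointTrapOpcode() const {
    if (m_arch == Arch::AArch64)
      return {0x00, 0x00, 0x20, 0xd4}; // brk #0
    return {0xCC};                     // x86 one-byte breakpoint
  }

protected:
  Arch m_arch;
};

class WindowsProcess : public GenericProcess {
public:
  using GenericProcess::GenericProcess;

  // Windows-specific override: only aarch64 differs from the fallback.
  std::vector<uint8_t> GetSoftwareBreakpointTrapOpcode() const override {
    if (m_arch == Arch::AArch64)
      return {0x00, 0x00, 0x3e, 0xd4}; // brk #0xf000
    return GenericProcess::GetSoftwareBreakpointTrapOpcode();
  }
};
```

Keeping the fallback in the base class leaves the Linux behaviour untouched while the Windows-only value lives next to the other Windows code.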

llvmbot · 2021-08-30T19:59:59Z

That was my gut feel as well - a need to have an OS-dependent breakpoint. I assumed that the current value must have been valid somewhere, but I don't have enough context to say one way or another.

Looking in AArch64InstrInfo.td, it basically says that "brk 0xf000" is used for a trap on Windows, and something different everywhere else.

I didn't even try ARM32. That's a whole different can of worms.

mstorsjo · 2021-08-30T20:17:33Z

> I didn't even try ARM32. That's a whole different can of worms.

Indeed. Wrt breakpoints, they can be either 2 or 4 bytes depending on the size of the instruction they're modifying.

I made an effort to bring arm32 up to an equally usable state, but I had to leave one rather involved patch. As thumb addresses can have the lowest bit set, address calculations based on such addresses need to strip out that bit, otherwise e.g. the dwarf line number mapping (for mingw mode code) gets desynced, see https://reviews.llvm.org/D70840.
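The bit stripping itself is a single mask operation. A minimal sketch follows; `StripThumbBit` is a hypothetical helper name, not the function used in D70840.

```cpp
#include <cassert>
#include <cstdint>

// On ARM32, bit 0 of a code address selects Thumb state and is not part of
// the instruction's location, so clear it before doing address lookups.
constexpr uint64_t StripThumbBit(uint64_t addr) {
  return addr & ~uint64_t{1};
}
```

Addresses that already have the bit clear pass through unchanged, so the helper is safe to apply unconditionally on ARM32 code addresses.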

Still regarding aarch64, I tried building lldb from the 13.x branch with these modifications:

diff --git a/lldb/source/Host/common/NativeProcessProtocol.cpp b/lldb/source/Host/common/NativeProcessProtocol.cpp
index ea80a05430f7..ad5ed9746e5d 100644
--- a/lldb/source/Host/common/NativeProcessProtocol.cpp
+++ b/lldb/source/Host/common/NativeProcessProtocol.cpp
@@ -485,7 +485,7 @@ NativeProcessProtocol::EnableSoftwareBreakpoint(lldb::addr_t addr,

 llvm::Expected<llvm::ArrayRef<uint8_t>>
 NativeProcessProtocol::GetSoftwareBreakpointTrapOpcode(size_t size_hint) {
-  static const uint8_t g_aarch64_opcode[] = {0x00, 0x00, 0x20, 0xd4};
+  static const uint8_t g_aarch64_opcode[] = {0x00, 0x00, 0x3e, 0xd4};
   static const uint8_t g_i386_opcode[] = {0xCC};
   static const uint8_t g_mips64_opcode[] = {0x00, 0x00, 0x00, 0x0d};
   static const uint8_t g_mips64el_opcode[] = {0x0d, 0x00, 0x00, 0x00};
diff --git a/lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp b/lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
index 16e67bb9f863..43e3a8689315 100644
--- a/lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
+++ b/lldb/source/Plugins/Process/Windows/Common/NativeThreadWindows.cpp
@@ -53,7 +53,7 @@ Status NativeThreadWindows::DoResume(lldb::StateType resume_state) {
             eRegisterKindGeneric, LLDB_REGNUM_GENERIC_FLAGS);
     uint64_t flags_value =
         GetRegisterContext().ReadRegisterAsUnsigned(flags_index, 0);
-    flags_value |= 0x100; // Set the trap flag on the CPU
+    flags_value |= 0x200000; // Set the trap flag on the CPU
     GetRegisterContext().WriteRegisterFromUnsigned(flags_index, flags_value);
   }

diff --git a/lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp b/lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp
index 379496ba0d54..af1c9d78ebaf 100644
--- a/lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp
+++ b/lldb/source/Plugins/Process/Windows/Common/ProcessWindows.cpp
@@ -439,7 +439,7 @@ void ProcessWindows::RefreshStateAfterStop() {
     RegisterContextSP register_context = stop_thread->GetRegisterContext();

     // The current EIP is AFTER the BP opcode, which is one byte.
-    uint64_t pc = register_context->GetPC() - 1;
+    uint64_t pc = register_context->GetPC() - 4;

     BreakpointSiteSP site(GetBreakpointSiteList().FindByAddress(pc));
     if (site) {

However I'm still getting the illegal instruction errors when stopping on breakpoints:

$ lldb.exe -- hello-exception.exe
(lldb) target create "hello-exception.exe"
Current executable set to 'C:\code\hello-exception.exe' (aarch64).
(lldb) b main
Breakpoint 1: where = hello-exception.exe`main + 28 at hello-exception.cpp:75:14, address = 0x00000001400016ac
(lldb) run
Process 8728 launched: 'C:\code\hello-exception.exe' (aarch64)
Process 8728 stopped

thread #1, stop reason = Exception 0xc000001d encountered at address 0x7ff7b7dd16ac

llvmbot · 2021-08-30T20:25:12Z

The breakpoint encodings are in two locations. A similar change is needed in source/Target/Platform.cpp.

mstorsjo · 2021-08-30T20:29:48Z

> The breakpoint encodings are in two locations. A similar change is needed
> in source/Target/Platform.cpp.

Oh, thanks!

With that in place, stopping at the breakpoint is clean, but stepping still fails:

$ lldb.exe -- hello-exception.exe
(lldb) target create "hello-exception.exe"
Current executable set to 'C:\code\hello-exception.exe' (aarch64).
(lldb) b main
Breakpoint 1: where = hello-exception.exe`main + 28 at hello-exception.cpp:75:14, address = 0x00000001400016ac
(lldb) run
Process 7904 launched: 'C:\code\hello-exception.exe' (aarch64)
Process 7904 stopped

thread #1, stop reason = breakpoint 1.1
frame #0: 0x00007ff7b7dd16ac hello-exception.exe`main(argc=1, argv=0x000001eb44014140) at hello-exception.cpp:75:14
[....]
(lldb) step
Process 7904 stopped
thread #1, stop reason = Exception 0xc00000ff encountered at address 0x7fff44a181b0
frame #0: 0x00007fff44a181b0 ntdll.dll`RtlRaiseStatus + 32

llvmbot · 2021-08-30T20:35:53Z

What's the source for your testcase? Is it throwing an exception by any chance (RtlRaiseStatus() sort of suggests this)?

If you run the thing without a breakpoint, does it run to completion, or does it also throw an exception?

mstorsjo · 2021-08-30T20:39:36Z

> What's the source for your testcase? Is it throwing an exception by any
> chance (RtlRaiseStatus() sort of suggests this)?
>
> If you run the thing without a breakpoint, does it run to completion, or
> does it also throw an exception?

Oh, right, yes, my testcase does throw an exception.

If I run on a testcase that doesn't throw any exception, doing "step" just runs the process to completion.

llvmbot · 2021-08-30T20:41:11Z

Here is an example of what I am getting with a trivial hello world application:

C:\eric\test>lldb hello2.exe
(lldb) target create "hello2.exe"
Current executable set to 'C:\eric\test\hello2.exe' (aarch64).
(lldb) b main
Breakpoint 1: where = hello2.exe`main + 16 at hello2.cpp:9, address = 0x0000000140001010
(lldb) run
Process 2872 launched: 'C:\eric\test\hello2.exe' (aarch64)
Process 2872 stopped

thread #1, stop reason = breakpoint 1.1
frame #0: 0x00007ff6b9311010 hello2.exe`main(argc=, argv=) at hello2.cpp:9
6
7 int main(int argc, char * argv[])
8 {
-> 9 cout << "Hello world" << endl;
10 }
(lldb) n
Process 2872 stopped
thread #1, stop reason = step over
frame #0: 0x00007ff6b9311040 hello2.exe`main(argc=, argv=) at hello2.cpp:10
7 int main(int argc, char * argv[])
8 {
9 cout << "Hello world" << endl;
-> 10 }
(lldb)

mstorsjo · 2021-08-30T20:49:56Z

> Here is an example of what I am getting with a trivial hello world
> application:
>
> C:\eric\test>lldb hello2.exe
> (lldb) target create "hello2.exe"
> Current executable set to 'C:\eric\test\hello2.exe' (aarch64).
> (lldb) b main
> Breakpoint 1: where = hello2.exe`main + 16 at hello2.cpp:9, address = 0x0000000140001010
> (lldb) run
> Process 2872 launched: 'C:\eric\test\hello2.exe' (aarch64)
> Process 2872 stopped
>
> thread #1, stop reason = breakpoint 1.1
> frame #0: 0x00007ff6b9311010 hello2.exe`main(argc=, argv=) at hello2.cpp:9
> 6
> 7 int main(int argc, char * argv[])
> 8 {
> -> 9 cout << "Hello world" << endl;
> 10 }
> (lldb) n
> Process 2872 stopped
> thread #1, stop reason = step over
> frame #0: 0x00007ff6b9311040 hello2.exe`main(argc=, argv=) at hello2.cpp:10
> 7 int main(int argc, char * argv[])
> 8 {
> 9 cout << "Hello world" << endl;
> -> 10 }
> (lldb)

Ok, with that testcase, with an executable built with debug info, stepping does seem to work, but with a more complex testcase, stepping fails and just runs to completion.

llvmbot · 2021-08-30T20:50:19Z

There is a second place where the single-step flag needs to be changed in TargetThreadWindows.cpp in TargetThreadWindows::DoResume()

mstorsjo · 2021-08-30T20:53:37Z

> There is a second place where the single-step flag needs to be changed in
> TargetThreadWindows.cpp in TargetThreadWindows::DoResume()

Oh, right - then stepping seems to work great too! Thanks!

llvmbot · 2021-09-02T09:47:41Z

I have investigated this issue using a Windows on Arm Surface Pro, and the problem appears to be misconfiguration rather than wrong opcode selection.

Native LLDB should be configured with target triple = aarch64-windows-pc, but I see it getting x86_64-windows-pc, and the wrong opcode is pulled in for the same reason.

I have a hack in place where we configure LLVM_TARGET_TRIPLE=aarch64-windows-pc and also change /llvm/lib/Support/Host.cpp with the following correction, and everything falls into place as far as breakpoints are concerned.

 std::string sys::getProcessTriple() {
-  std::string TargetTripleString = updateTripleOSVersion(LLVM_HOST_TRIPLE);
+  std::string TargetTripleString = updateTripleOSVersion(LLVM_DEFAULT_TARGET_TRIPLE);

mstorsjo · 2021-09-02T09:55:48Z

> I have investigated this issue using a windows on arm surface pro and
> problem appears to be misconfiguration rather than wrong opcode selection.
>
> Native LLDB should be configured with target triple = aarch64-windows-pc but
> I see it getting x86_64-windows-pc and when opcode is pulled in wrong opcode
> is pulled for the same reason.

It sounds like you are struggling with entirely different issues.

The issue at hand here is not about accidentally getting an x86_64 breakpoint opcode - none of us are experiencing that.

It's about there being a subtle difference between OSes on aarch64 about how exactly to produce an aarch64 breakpoint instruction. On Linux, "brk #0" (0x00, 0x00, 0x20, 0xd4) is used (which is what LLDB produces right now), but on Windows, "brk #0xf000" (0x00, 0x00, 0x3e, 0xd4) should be used instead.

Due to this, when stopping at a breakpoint, the process is halted with an invalid instruction exception, instead of gently stopped so it can be resumed.
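For reference, the two byte sequences differ only in the 16-bit immediate of the BRK instruction: the A64 encoding places it in bits 20:5 of the base word 0xD4200000, and the word is then stored little-endian. A small sketch that reproduces both sequences follows; the helper name `EncodeBrk` is made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encodes "brk #imm16" and returns the instruction as little-endian bytes.
constexpr std::array<uint8_t, 4> EncodeBrk(uint16_t imm16) {
  // Base opcode for BRK with the immediate shifted into bits 20:5.
  const uint32_t word = 0xD4200000u | (static_cast<uint32_t>(imm16) << 5);
  return {{static_cast<uint8_t>(word & 0xff),
           static_cast<uint8_t>((word >> 8) & 0xff),
           static_cast<uint8_t>((word >> 16) & 0xff),
           static_cast<uint8_t>((word >> 24) & 0xff)}};
}
```

`EncodeBrk(0)` yields the Linux-style bytes and `EncodeBrk(0xf000)` yields the Windows ones, which is why only the third byte changes (0x20 versus 0x3e).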

mstorsjo · 2021-09-14T19:20:12Z

I posted a cleaned up version of these fixes in https://reviews.llvm.org/D109777.

mstorsjo · 2021-09-22T11:14:19Z

I pushed a fix incorporating these changes now in 9f34f75.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021

This issue was closed.
