Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

windows x64 exception data is innacurate #46803

Open
jwitmann opened this issue Sep 8, 2020 · 10 comments
Open

windows x64 exception data is innacurate #46803

jwitmann opened this issue Sep 8, 2020 · 10 comments
Labels
bugzilla Issues migrated from bugzilla clang:to-be-triaged Should not be used for new issues platform:windows

Comments

@jwitmann
Copy link

jwitmann commented Sep 8, 2020

Bugzilla Link 47459
Version unspecified
OS Windows NT
Attachments fix for wrong $ip2state records.
CC @majnemer,@rnk

Extended Description

Hi,

While I was scrutinizing some generated code inside IDA PRO, I realized the $ip2state records are wrong by 1 byte... which in turn make the disassembler display the wrong start for try[]. This is bad, as it would dispatch the wrong exception handler if an error happens on the 1st instruction.

I spent some time and found that the source of the problem is WinException::getLabel() used in computeIP2StateTable. While the function correctly build a label plus 1 for other EH table uses, in computeIP2StateTable it should really not add 1. So I create a new function that just does that and it resolved the issue.

I attach a patch... I don't really have time to do more than this, nor can i provide a test file.

Best regards,
Jerome WITMANN

@majnemer
Copy link
Mannequin

majnemer mannequin commented Sep 8, 2020

Hi! Can you please send this patch to llvm-commits?

@rnk
Copy link
Collaborator

rnk commented Sep 8, 2020

I think IDA PRO must be assuming that EH labels mark instruction boundaries, but LLVM doesn't uphold that invariant.

I haven't tested your patch, but I believe it would actually break exception handling, because I believe ip2state ranges are half-open (inclusive left, exclusive right).

Consider this kind of input:

void mayThrow();
void mayNotThrow();
void f() {
  try {
    mayThrow();
  } catch(...) {}
  mayNotThrow();
}

Clang generates this code sequence:

.Ltmp0:
        callq   "?mayThrow@@YAXXZ"
.Ltmp1:
        callq   "?mayNotThrow@@YAXXZ"

The labels are used in the ip2state table with a one byte offset so that exceptions raised at the .Ltmp1 return address are caught.

MSVC inserts a nop before ip2state transitions. See this code sequence:

        call    ?mayThrow@@YAXXZ                        ; mayThrow
        npad    1
$LN7@f:
; Line 7
        call    ?mayNotThrow@@YAXXZ                     ; mayNotThrow

Inserting that extra nop is difficult in LLVM, but looking at it now, I think it is feasible.

The ip2state label logic is mostly controlled by InvokeStateChangeIterator, which iterates the MachineInstrs and finds the labels that matter. We could actually hook EHStreamer::endInstruction to emit a nop at the end of every ip2state range.

LLVM's ip2state+1 label bias has been a surprisingly common source of confusion for users, so this might be worth doing, but it's sufficiently hard that it's not likely to happen soon. It may be a pre-requisite for more fine-grained SEH, so maybe it will happen along the way to that.

Honestly, you might get more traction asking IDA for a fix. It's probably much easier for them to check if ip2state labels are off by one.

@jwitmann
Copy link
Author

jwitmann commented Sep 9, 2020

Hi Reid,

I disagree, your assumptions are just that… I verified the output of the MS compiler first, and there is no added padding in my test output EVER. The IP2State start label in MS points to the start of an instruction not after the byte of the instruction. Case in point the first entry of the ip2state is the start of the function, then all the other items points inside an instruction.

Example:

Microsoft CL 2017

.xdata:00000000000150B8 $ip2state$?_main@@YAHXZ IPtoStateMap <rva loc_C1AF, -1>
.xdata:00000000000150B8                    ; .rdata:$cppxdata$?_main@@YAHXZ↑o
.xdata:00000000000150C0                         IPtoStateMap <rva loc_C320, 0>
.xdata:00000000000150C8                         IPtoStateMap <rva loc_C348, -1>
.xdata:00000000000150D0                         IPtoStateMap <rva loc_CC63, 1>
...

.text$mn:000000000000C318                  mov     rbx, rax
.text$mn:000000000000C31B                  mov     [rsp+910h+var_8D0], rax
.text$mn:000000000000C31B
.text$mn:000000000000C320
.text$mn:000000000000C320 loc_C320:                ; .xdata:00000000000150C0↓o
.text$mn:000000000000C320 ;   try {
.text$mn:000000000000C320                  xor     r13d, r13d
.text$mn:000000000000C323                  test    rax, rax
.text$mn:000000000000C326                  jz      short loc_C345
.text$mn:000000000000C326
.text$mn:000000000000C328                  xor     eax, eax
.text$mn:000000000000C32A                  mov     [rbx], rax
.text$mn:000000000000C32D                  mov     [rbx+8], rax
.text$mn:000000000000C331                  mov     rcx, rbx        ; this
.text$mn:000000000000C334                  call    BaseDriver::BaseDriver(void)
.text$mn:000000000000C334
.text$mn:000000000000C339                  lea     rax, const WinDriver::`vftable'
.text$mn:000000000000C340                  mov     [rbx], rax
.text$mn:000000000000C343                  jmp     short loc_C348
.text$mn:000000000000C343
.text$mn:000000000000C345 ; --------------------------------------------------
.text$mn:000000000000C345
.text$mn:000000000000C345 loc_C345:        ; _main(void)+1B6↑j
.text$mn:000000000000C345                  mov     rbx, r13
.text$mn:000000000000C345 ;   } // starts at C320
.text$mn:000000000000C345
.text$mn:000000000000C348
.text$mn:000000000000C348 loc_C348:        ; _main(void)+1D3↑j
.text$mn:000000000000C348                  ; .xdata:00000000000150C8↓o

CLANG 11 rc2:

.xdata:0000000000034D3C $ip2state$?_main@@YAHXZ IPtoStateMap <rva _main(void), -1>
.xdata:0000000000034D3C                     ; .xdata:$cppxdata$?_main@@YAHXZ↑o
.xdata:0000000000034D44                     IPtoStateMap <rva loc_245FC, 0>
.xdata:0000000000034D4C                     IPtoStateMap <rva loc_2460B, -1>
.xdata:0000000000034D54                     IPtoStateMap <rva loc_24E98, 6>

.text:00000000000245F5                     xorps   xmm6, xmm6
.text:00000000000245F8                     movaps  xmmword ptr [rax], xmm6
.text:00000000000245F8
.text:00000000000245F8 ; -----------------------------------------------------
.text:00000000000245FB                     db 48h
.text:00000000000245FC ; -----------------------------------------------------
.text:00000000000245FC
.text:00000000000245FC loc_245FC:          ;.xdata:0000000000034D44↓o
.text:00000000000245FC ;   try {
.text:00000000000245FC                     mov     [rbp+540h], eax
.text:0000000000024602                     mov     rcx, rax
.text:0000000000024605                     call    BaseDriver::BaseDriver(void)
.text:0000000000024605
.text:0000000000024605 ; ------------------------------------------------------
.text:000000000002460A                     db 48h
.text:000000000002460A ;   } // starts at 245FC
.text:000000000002460B ; ------------------------------------------------------
.text:000000000002460B
.text:000000000002460B loc_2460B:                                      ; .xdata:0000000000034D4C↓o
.text:000000000002460B                     lea     eax, const WinDriver::`vftable'
.text:0000000000024611                     mov     rcx, [rbp+540h]
.text:0000000000024618                     mov     [rcx], rax
.text:000000000002461B                     mov     qword ptr cs:BaseDriver * driver, rcx
.text:0000000000024622                     mov     ecx, 8
.text:0000000000024627                     call    operator new(unsigned __int64)

Granted the optimization applied are different, however the ip2state layout and data is eloquent enough in my view.

Best regards,
Jerome

@jwitmann
Copy link
Author

jwitmann commented Sep 9, 2020

Also to be clear, I didn't report this because of IDA output (I don't care) but because I am concerned that the exception table is wrong and would trigger the wrong EH if a problem arose at the start of the block defined in ip2state.

Jerome

@rnk
Copy link
Collaborator

rnk commented Sep 9, 2020

Hi Reid,

I disagree, your assumptions are just that… I verified the output of the MS
compiler first, and there is no added padding in my test output EVER. The
IP2State start label in MS points to the start of an instruction not after
the byte of the instruction. Case in point the first entry of the ip2state
is the start of the function, then all the other items points inside an
instruction.

I think we are in agreement: MSVC's ip2state labels always point to an instruction boundary, Clang's do not. I agree that there are no nops in your example, but it is because they are not necessary: the last instruction in the try range (jmp in this case) cannot throw an exception.

In my example, when optimizations are enabled, MSVC may insert a nop (npad 1) to ensure that there is a non-throwing instruction at the end of the try range.

Also to be clear, I didn't report this because of IDA output (I don't care)
but because I am concerned that the exception table is wrong and would
trigger the wrong EH if a problem arose at the start of the block defined in
ip2state.

LLVM only supports catching exceptions thrown from a call (/EHs), so the problem is somewhat simplified. If an exception is raised from a call instruction, it is always raised at the return address, and every call instruction is at least one byte long. So, I believe our tables do work for the use cases that we support.

If we support non-call exceptions (/EHa), then we will have to re-evaluate this decision. Consider, for example, a single-byte instruction that traps and raises an exception. It would probably be handled in the wrong EH state, but we don't currently support such a thing.

@jwitmann
Copy link
Author

jwitmann commented Sep 10, 2020

Hi Reid,

After looking more into the output, I did notice that a nop is inserted after a call with VC. I was under the impression that it was -EHa not -EHs. That being said, I am not sure why emitting a nop after a call would be that difficult at that stage, it's the AsmPrinter after all (to mimic MS behaviour). I might have a go myself, and eventually fight with the optimizer again.

I suppose you should close this bug then.

Jerome

@rnk
Copy link
Collaborator

rnk commented Sep 10, 2020

Hi Reid,

After looking more into the output, I did notice that a nop is inserted
after a call with VC. I was under the impression that it was -EHa not -EHs.
That being said, I am not sure why emitting a Nop after a call would be that
difficult at that stage, it's the AsmPrinter after all (to mimic MS
behaviour). I might have a go myself, and eventually fight with the
optimizer again.

I suppose you should close this bug then.

This isn't the first time someone has been confused by this label+1 design choice, so IMO we can keep this open. It would be nice for these labels to point to instruction boundaries, even if it's not a correctness problem for us right now. Good luck with any attempts! There is some existing functionality for inserting nops before the epilogue to avoid similar problems, I think if you grep for SEH_Epilogue you might find it.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@namazso
Copy link
Contributor

namazso commented Apr 7, 2022

@rnk the lack of nops does cause correctness problems though:

#include <intrin.h>
#include <cstdio>

__declspec(noinline) void do_nothing() {}
__declspec(noinline) void test()
{
	__try {
		do_nothing();
	} __except(1) {
		printf("hello\n");
	}
	__debugbreak();
}
int main()
{
	__try {
		test();
	} __except(1) {
		printf("world\n");
	}
	return 0;
}

This will infinitely print out "hello" on LLVM 14.0.0
Should I open a new issue for this?

@rnk
Copy link
Collaborator

rnk commented Apr 15, 2022

Yes, I think this is a separate issue. I think it would be reasonable to add nops to functions using SEH try, since they have the ability to catch exceptions from single byte instructions such as int3 from __debugbreak in this case. C++ exception regions are very, very common (every variable with a destructor, such as strings), and __try regions are rare.

@namazso
Copy link
Contributor

namazso commented Apr 15, 2022

I'll open a new issue for this then.

In the meantime I also realized that this current issue about the start of the ranges being misaligned is actually working as intended. If the range were to start at the begin of the call instruction, it would catch exceptions raised from the call instruction itself (which is same frame, therefore officially not supperted as of now), rather than a lower frame (which is supported by clang). A call instruction can fail when calling a noncanonical address, for example.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla clang:to-be-triaged Should not be used for new issues platform:windows
Projects
None yet
Development

No branches or pull requests

4 participants