Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[x86 Disassembler] Displacement following SIB byte parsed with wrong instruction length with REX.B #19234

Closed
llvmbot opened this issue Feb 16, 2014 · 4 comments
Assignees
Labels
backend:X86 bugzilla Issues migrated from bugzilla

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 16, 2014

Bugzilla Link 18860
Resolution FIXED
Resolved on Nov 07, 2018 00:22
Version trunk
OS Windows NT
Blocks #11360
Reporter LLVM Bugzilla Contributor
CC @topperc

Extended Description

I've been trying to get my head around the x86 instruction format lately, and have been looking at various disassemblers to understand how it fits together (as well as writing my own as an exercise).

I've noticed that the implementation in LLVM seems to disagree with other assemblers/docs in this detail.

Consider the following instruction bytes (compiled for x86-64)
43 80 04 05 78 56 34 12 00

One online disassembler (http://onlinedisassembler.com/odaweb/) gives the following:
add BYTE PTR [r8*1+0x12345678],0x0

which after much consideration I agree with.

The crux here is that the sib byte (05) corresponds to the special case where the base register is ignored and a 4 byte displacement follows the SIB byte. The docs (both Intel and AMD) are rather unclear on this, but my understanding is that this special case should apply regardless of whether the REX prefix is present; as does the special case for the MODR/M byte that indicates the SIB byte is required. My understanding is that the REX prefix shouldn't affect the length of the instruction.

However, the code in LLVM (from x86DissasemblerDecoder.c, reproduced here) takes the REX.B prefix into account before performing the test, and so does not read the displacement:

base = baseFromSIB(insn->sib) | (bFromREX(insn->rexPrefix) << 3);

switch (base) {
case 0x5:
...

I think, much like the case in the readModRM function the switch should also allow 0xd as so:

base = baseFromSIB(insn->sib) | (bFromREX(insn->rexPrefix) << 3);

switch (base) {
case 0x5:
case 0xd: /* in case REXW.b is set */
...

@llvmbot
Copy link
Collaborator Author

llvmbot commented Feb 16, 2014

assigned to @topperc

@topperc
Copy link
Collaborator

topperc commented Feb 17, 2014

I believe I agree with you for mod=00.

I think for mod=01 and mod=10 with rex.b we need to select r13 as the base and have a displacement.

@topperc
Copy link
Collaborator

topperc commented Feb 17, 2014

Fixed in r201507.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Feb 17, 2014

That was quick!

I agree with your fix, so am marking the bug CLOSED.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

2 participants