Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM Assembler problem with ADR instruction #13613

Open
RichBarton-Arm opened this issue Jun 29, 2012 · 6 comments
Open

ARM Assembler problem with ADR instruction #13613

RichBarton-Arm opened this issue Jun 29, 2012 · 6 comments
Labels
backend:ARM bugzilla Issues migrated from bugzilla

Comments

@RichBarton-Arm
Copy link
Collaborator

Bugzilla Link 13241
Version trunk
OS All
Blocks llvm/llvm-bugzilla-archive#18926
CC @rengolin

Extended Description

The MCInst for the ARM ADR instruction as created by the MC assembler is re-encoded into an ADD/SUB instruction (Reproducers below)

It seems to me that there has been some re-engineering of the ADD/SUB instructions and the ADR instruction has not been taken into account in this.

Reproduce with:

(A1 encoding)
echo 'ADR r0,#0x40000000' | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r0, #​1073741824 @ encoding: [0x00,0x00,0x8f,0xe2]
@ <MCInst #​30 ADR
@
@
@
@ >

echo 0x00 0x00 0x8f 0xe2 | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
add r0, pc, #​0 @ encoding: [0x00,0x00,0x8f,0xe2]
@ <MCInst #​24 ADDri
@
@
@
@
@
@ >

(A2 encoding)
echo 'ADR r0,#-0x0' | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r0, #-2147483648 @ encoding: [0x00,0x00,0x4f,0xe2]
@ <MCInst #​30 ADR
@
@
@
@ >

echo 0x00 0x00 0x4f 0xe2 | ./llvm-oss/build-none/bin/llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
sub r0, pc, #​0 @ encoding: [0x00,0x00,0x4f,0xe2]
@ <MCInst #​456 SUBri
@
@
@
@
@
@ >

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 23, 2012

ARM ARM says below in A4.2.2 Use of labels in UAL instruction syntax,

"When the assembler calculates an offset of 0 for the normal syntax of this instruction, it must assemble the encoding that adds 0 to the Align(PC,4) value of the instruction. The encoding that subtracts 0 from the Align(PC,4) value cannot be specified by the normal syntax."

So the decoding of "0x00 0x00 0x4f 0xe2" is "sub r0, pc, #​0" follows the rule.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 23, 2012

For thumb2, ARM ARM says,

It is recommended that the alternative syntax forms are avoided where possible. However, the only possible syntax for encoding T2 with all immediate bits zero is
SUB ,PC,#0.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 23, 2012

Come up with an even general case,

$ echo 'adr r0,#0x4' | llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r0, #​4 @ encoding: [0x04,0x00,0x8f,0xe2]
@ <MCInst #​30 ADR
@
@
@
@ >
$ echo 0x04 0x00 0x8f 0xe2 | llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
add r0, pc, #​4 @ encoding: [0x04,0x00,0x8f,0xe2]
@ <MCInst #​24 ADDri
@
@
@
@
@
@ >

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 24, 2012

/* the following example is incorrect /
$ echo 'adr r5,#0x1234' | llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
adr r5, #​4660 @ encoding: [0x34,0x52,0xcf,0xe2]
@ <MCInst #​30 ADR
@
@
@
@ >
/
the following example is correct */
$ echo 0x34 0x52 0xcf 0xe2 | llvm-mc -triple=armv7 -show-inst -show-encoding -disassemble
.section __TEXT,__text,regular,pure_instructions
sbc r5, pc, #​1073741827 @ encoding: [0x0d,0x51,0xcf,0xe2]
@ <MCInst #​330 SBCri
@
@
@
@
@
@ >

/* the following example is correct */
$ echo 'sbc r5, pc, #​1073741827' | llvm-mc -triple=armv7 -show-inst -show-encoding
.section __TEXT,__text,regular,pure_instructions
sbc r5, pc, #​1073741827 @ encoding: [0x0d,0x51,0xcf,0xe2]
@ <MCInst #​330 SBCri
@
@
@
@
@
@ >

This result is quite wired. There are two issues.

  1. #​0x1234 can't be really encoded into adr instruction, assembler only silently keeps the low 12-bit value, ie. 0x234. Need error/warning message.
  2. The coding for 0x234 should be encoded as a modified constant rather than a plain const. So actually even 0x234 can't be really encoded into adr instruction.

@rengolin
Copy link
Member

Hi Richard, is this still a problem?

@rengolin
Copy link
Member

mentioned in issue llvm/llvm-bugzilla-archive#18926

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
vitalybuka pushed a commit that referenced this issue Sep 20, 2023
#66343)

The FileCheck string `LLVMFuzzerCustomMutatorLongSequence: {{.*}} MS:
{{[0-9]*}} {{(([a-zA-Z]*-){11,})}} {{.*}}` is too restrictive and may
fail the test in some case.

If we look at the commit that added this
check(66df989),
This check is for printing out the long mutation sequence, such as this
one
```
#53552  REDUCE cov: 6 ft: 6 corp: 5/9b lim: 4096 exec/s: 0 rss: 37Mb L: 2/3 MS: 54 ChangeByte-PersAutoDict-ChangeBit-ChangeBinInt-ChangeBit-ChangeBit-ChangeByte-CMP-EraseBytes-EraseBytes-CrossOver-InsertRepeatedBytes-ChangeByte-EraseBytes-InsertRepeatedBytes-ShuffleBytes-ChangeByte-ShuffleBytes-ChangeBit-CrossOver-ChangeBit-ShuffleBytes-ChangeBinInt-ShuffleBytes-EraseBytes-InsertByte-Custom-ShuffleBytes-CopyPart-InsertRepeatedBytes-PersAutoDict-InsertRepeatedBytes-ChangeByte-CrossOver-CrossOver-PersAutoDict-PersAutoDict-EraseBytes-ChangeBit-CopyPart-ChangeByte-CopyPart-InsertRepeatedBytes-CrossOver-CrossOver-CrossOver-CrossOver-ShuffleBytes-EraseBytes-InsertByte-InsertRepeatedBytes-CrossOver-EraseBytes-Custom- DE: "\377\377"-"\001\000"-"\001\000"-"\000\000\000\000\000\000\000\000"-"\001\000\000\000"-
```

But if we look at the code doing the printing
```cpp
void MutationDispatcher::PrintMutationSequence(bool Verbose) {
  Printf("MS: %zd ", CurrentMutatorSequence.size());
  size_t EntriesToPrint =
      Verbose ? CurrentMutatorSequence.size()
              : std::min(kMaxMutationsToPrint, CurrentMutatorSequence.size());
  for (size_t i = 0; i < EntriesToPrint; i++)
    Printf("%s-", CurrentMutatorSequence[i].Name);
  if (!CurrentDictionaryEntrySequence.empty()) {
    Printf(" DE: ");
    EntriesToPrint = Verbose ? CurrentDictionaryEntrySequence.size()
                             : std::min(kMaxMutationsToPrint,
                                        CurrentDictionaryEntrySequence.size());
    for (size_t i = 0; i < EntriesToPrint; i++) {
      Printf("\"");
      PrintASCII(CurrentDictionaryEntrySequence[i]->GetW(), "\"-");
    }
  }
}
```

We can see that the `DE: XXX` is not always printed. So the following
output is possible(and is from real-life failure), notince the missing
of `DE: XXX`.
```
#13613  NEW    cov: 5 ft: 5 corp: 4/6b lim: 4096 exec/s: 0 rss: 32Mb L: 2/2 MS: 27 InsertByte-ChangeBinInt-ChangeBinInt-CrossOver-ShuffleBytes-ChangeBit-EraseBytes-ShuffleBytes-InsertByte-InsertRepeatedBytes-CopyPart-InsertByte-ChangeByte-ChangeBit-InsertByte-CrossOver-EraseBytes-CopyPart-ShuffleBytes-EraseBytes-InsertByte-InsertRepeatedBytes-CrossOver-CrossOver-ShuffleBytes-ChangeBit-Custom-
#13765  ......
```
This output is totally legit and will fail that check.

So I remove the check for the following strings, I think `MS: {{[0-9]*}}
{{(([a-zA-Z]*-){11,})}}` is sufficient for checking the long mutation
sequence. This should help resolve the flaky failure of
fuzzer-custommutator.test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

3 participants