Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llvm-tblgen fails to run during a release build with lld as host linker on arm macs #48788

Closed
nico opened this issue Mar 5, 2021 · 6 comments
Labels
bugzilla Issues migrated from bugzilla lld:MachO

Comments

@nico
Copy link
Contributor

nico commented Mar 5, 2021

Bugzilla Link 49444
Resolution FIXED
Resolved on Mar 05, 2021 09:30
Version unspecified
OS All
CC @gkmhub,@int3,@smeenai
Fixed by commit(s) https://reviews.llvm.org/rG210cc0738bbeecf97c9698e2bbe54bbb7d520387

Extended Description

thakis@M1BP llvm-project % time caffeinate ninja -C out/gn lld
ninja: Entering directory `out/gn'
[11/1390] ACTION //lld/wasm:Options(//llvm/utils/gn/build/toolchain:unix)
FAILED: gen/lld/wasm/Options.inc
python3 ../../llvm/utils/gn/build/run_tablegen.py bin/llvm-tblgen --write-if-changed -I ../../llvm/include -I ../../lld/wasm -d gen/lld/wasm/Options.inc.d -o gen/lld/wasm/Options.inc ../../lld/wasm/Options.td -gen-opt-parser-defs
LLVM ERROR: out of memory
Allocation failed
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
LLVM ERROR: out of memory
Allocation failed

Doesn't happen in a debug build.

@nico
Copy link
Contributor Author

nico commented Mar 5, 2021

(lldb) bt

  • thread #​1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    • frame #​0: 0x00000001958a7cec libsystem_kernel.dylib__pthread_kill + 8 frame #​1: 0x00000001958d8c24 libsystem_pthread.dylibpthread_kill + 292
      frame #​2: 0x0000000195820864 libsystem_c.dylibabort + 104 frame #​3: 0x0000000100274858 llvm-tblgenllvm::report_bad_alloc_error(char const*, bool) + 140
      frame #​4: 0x000000010028064c llvm-tblgenllvm::SmallVectorBase<unsigned long long>::grow_pod(void*, unsigned long, unsigned long) + 224 frame #&#8203;5: 0x0000000100184f34 llvm-tblgenvoid llvm::SmallVectorImpl::append<char const*, void>(char const*, char const*) + 144
      frame #​6: 0x00000001003137c0 llvm-tblgenllvm::BinOpInit::getStrConcat(llvm::Init*, llvm::Init*) + 116 frame #&#8203;7: 0x0000000100336630 llvm-tblgenQualifyName(llvm::Record&, llvm::MultiClass*, llvm::Init*, llvm::StringRef) + 72
      frame #​8: 0x000000010033b3d8 llvm-tblgenllvm::TGParser::ParseDeclaration(llvm::Record*, bool) + 288 frame #&#8203;9: 0x000000010033bc18 llvm-tblgenllvm::TGParser::ParseTemplateArgList(llvm::Record*) + 88
      frame #​10: 0x000000010033e8d8 llvm-tblgenllvm::TGParser::ParseClass() + 696 frame #&#8203;11: 0x00000001003406a0 llvm-tblgenllvm::TGParser::ParseFile() + 92
      frame #​12: 0x000000010030ce18 llvm-tblgenllvm::TableGenMain(char const*, bool (*)(llvm::raw_ostream&, llvm::RecordKeeper&)) + 828 frame #&#8203;13: 0x000000010022b228 llvm-tblgenmain + 116
      frame #​14: 0x00000001958f4f34 libdyld.dylib`start + 4

@nico
Copy link
Contributor Author

nico commented Mar 5, 2021

What's basically going wrong here is that ConcatStringInits() in lib/TableGen/Record.cpp creates a SmallString<80> on the stack and passes it to append() (the SmallString<> ctor gets optimized to an append() call). But in append, the code thinks that the SmallVector (which backs SmallString) has a size of 7953764196914526053 instead of 0, and then things go wrong.

objdump -dr of the relevant bits of Record.o (I added an attribute((noinline)) to ConcatStringInits()), with lots of notes since I'm fairly new to arm64 asm:

in-memory layout of SmallVector, all words 64 bit:
void *BeginX;
Size_T Size = 0, Capacity;
storage (so at start + 24)

0000000000004cb0 _ZL17ConcatStringInitsPKN4llvm10StringInitES2:
4cb0: ff 83 02 d1 sub sp, sp, #​160
4cb4: f6 57 07 a9 stp x22, x21, [sp, #​112]
4cb8: f4 4f 08 a9 stp x20, x19, [sp, #​128]
4cbc: fd 7b 09 a9 stp x29, x30, [sp, #​144]
4cc0: fd 43 02 91 add x29, sp, #​144
^ set up stack frame

4cc4: f3 03 01 aa                  	mov	x19, x1
4cc8: f4 03 00 aa                  	mov	x20, x0

^ copy parameters to temp regs

4ccc: 08 00 00 90                  	adrp	x8, #&#8203;0
	0000000000004ccc:  ARM64_RELOC_GOT_LOAD_PAGE21	___stack_chk_guard
4cd0: 08 01 40 f9                  	ldr	x8, [x8]
	0000000000004cd0:  ARM64_RELOC_GOT_LOAD_PAGEOFF12	___stack_chk_guard
4cd4: 08 01 40 f9                  	ldr	x8, [x8]  
4cd8: a8 83 1d f8                  	stur	x8, [x29, #-40]   => store *__stack_chk_guard on stack

^ stack guard setup

actually interesting code below
4cdc: 01 a0 41 a9 ldp x1, x8, [x0, #​24] => get stringref ptr and len
4ce0: 22 00 08 8b add x2, x1, x8 => x2 end ptr

4ce4: e8 03 00 91                  	mov	x8, sp
4ce8: 15 61 00 91                  	add	x21, x8, #&#8203;24  => init BeginX at start of stack
4cec: f5 03 00 f9                  	str	x21, [sp] => first word of smallstring on stck

this here then inits size and capacity via a constant island as far as I understand. That's what goes wrong:
4cf0: 08 00 00 90 adrp x8, #​0
0000000000004cf0: ARM64_RELOC_PAGE21 lCPI86_0
4cf4: 00 01 c0 3d ldr q0, [x8]
0000000000004cf4: ARM64_RELOC_PAGEOFF12 lCPI86_0
4cf8: e0 83 80 3c stur q0, [sp, #​8] => stores size on stack. this goes wrong. Oh I guess also stores cap?

and then we call append with x0, x1, x2:
4cfc: e0 03 00 91 mov x0, sp
4d00: 00 00 00 94 bl #0 <_ZL17ConcatStringInitsPKN4llvm10StringInitES2+0x50>
0000000000004d00: ARM64_RELOC_BRANCH26 _ZN4llvm15SmallVectorImplIcE6appendIPKcvEEvT_S5

The size/cap init lines after linking with ld64:

0x1002e0d4c <+64>:  nop    
0x1002e0d50 <+68>:  ldr    q0, 0x100348590           ; std::__1::piecewise_construct + 133
0x1002e0d54 <+72>:  stur   q0, [sp, #&#8203;0x8]

Same after linking with lld:

0x1003140a4 <+64>:  adrp   x8, 109
0x1003140a8 <+68>:  ldr    q0, [x8, #&#8203;0xbf00]
0x1003140ac <+72>:  stur   q0, [sp, #&#8203;0x8]

ld64 does a relaxation optimization (not just here, but in many places), but that in itself isn't the cause. The piecewise_construct in ld64's output looks like just a coincidence.

I don't know how to dump the constant island data yet. Maybe the wrong data is stored there (do constant islands have relocations?), or alignment to it is wrong or what. There are tons of ARM64_RELOC_PAGE21 / ARM64_RELOC_PAGEOFF12 relocs in the output so lld likely doesn't get those relocs completely wrong.

@int3
Copy link
Contributor

int3 commented Mar 5, 2021

I haven't confirmed this yet, but I was looking at a different crash today that makes me believe that our ARM64_RELOC_PAGE21 / ARM64_RELOC_PAGEOFF12 relocation handling is buggy. I think for ARM64_RELOC_PAGE21 we need to take the difference between the address of the page containing the pc and the address of the page containing the symbol, and right now we are taking the difference between the pc and the symbol and then rounding it to the nearest page. Will look into it more tomorrow

@nico
Copy link
Contributor Author

nico commented Mar 5, 2021

If I run objdump -rD with a capital D, the output contains:

00000000000187b8 lCPI82_0:
187b8: 00 00 00 00 udf #0
187bc: 20 00 00 00 udf #404

00000000000187c0 lCPI87_0:
187c0: 00 00 00 00 udf #0
187c4: 08 00 00 00 udf #380

But no lCPI86_0.

..aha!:

thakis@M1BP gn % nm ./obj/llvm/lib/TableGen/TableGen.Record.o | grep lCPI86_0
0000000000018920 s lCPI86_0
thakis@M1BP gn % nm ./obj/llvm/lib/TableGen/TableGen.Record.o | grep 0000000000018920
0000000000018920 s lCPI86_0
0000000000018920 s ltmp3

So ltmp3 and lCPI86_0 are the same address. (Is there some way to tell objdump to not symbolize reloc destinations?)

Slightly easier to see with dsymutil which sorts by address instead of symbol name:

thakis@M1BP gn % dsymutil -s ./obj/llvm/lib/TableGen/TableGen.Record.o | grep -C1 lCPI86_0
[ 80] 000086be 0e ( SECT ) 04 0000 0000000000018920 'ltmp3'
[ 81] 000089f2 0e ( SECT ) 04 0000 0000000000018920 'lCPI86_0'
[ 82] 000089cc 0e ( SECT ) 02 0000 00000000000187c0 'lCPI87_0'

and there's an ltmp3:

0000000000018920 :
...
18928: 50 00 00 00 udf #452
1892c: 00 00 00 00 udf #0

That's 80 for the capacity of the SmallString<80> and 0 for the size. (Swapped because little endian and use of stur to store it to the stack.)

So the relocation to that data island is somehow not done correctly by lld.

@nico
Copy link
Contributor Author

nico commented Mar 5, 2021

I think for ARM64_RELOC_PAGE21 we need to take the difference between the address of the page containing the pc and the address of the page containing the symbol, and right now we are taking the difference between the pc and the symbol and then rounding it to the nearest page.

If I read the code right, we already do what you think we should do.

I think what's going wrong is that encodePageOff12 doesn't check for vector ops when computing scale. This seems to fix it:

-inline uint64_t encodePageOff12(uint64_t base, uint64_t va) {

  • int scale = ((base & 0x3b000000) == 0x39000000) ? base >> 30 : 0;
    +inline uint64_t encodePageOff12(uint32_t base, uint64_t va) {
  • int scale = ((base & 0x3b00'0000) == 0x3900'0000) ? base >> 30u : 0;
  • if ((base & 0x3b00'0000) == 0x3900'0000 && scale == 0)
  • if ((base & 0x0480'0000) == 0x0480'0000)
  •  scale = 4;
    

I'll read some more and do some more testing, then I'll add tests and send it out.

@nico
Copy link
Contributor Author

nico commented Mar 5, 2021

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla lld:MachO
Projects
None yet
Development

No branches or pull requests

2 participants