lld creates bogus dwarf info when linking gcc object files #39828
Comments
Judging by these errors, "2, 3, 4 or 5" is the expected DWARF version, and the complaint is that something different was written instead. The first thing I would try is inspecting the binary with llvm-dwarfdump to see what is wrong with the DWARF info. Could you attach the /home/milian/projects/kf5/build-dbg/frameworks/ki18n/bin/ktranscript.so you have? I can take a look. Also, if you can pass -reproduce=sample.tar to the linker (and provide us with a reproducible sample), that would be great. One more idea: did you try to reproduce it with the latest LLD sources? |
I haven't tried it with the latest LLD sources yet; I probably need to compile all of LLVM for that too, right? I might revive my build setup for that and check, if you can't find anything obvious from the sample or llvm-dwarfdump yourself. The sample can be downloaded from here: https://swanson.kdab.com/owncloud/index.php/s/4tTkKnaytBGg7HZ (it's too large to attach here directly) |
Yes.
I do not see anything unusual with ktranscript.so. llvm-dwarfdump shows it has a single compile unit: .debug_info contents: The header looks fine to me and has DWARF version 4. I guess it might be a compiler issue rather than a linker one. |
Oh, and GNU dwarfdump reports an error:
dwarfdump ERROR: dwarf_offdie: DW_DLE_VERSION_STAMP_ERROR (48) I also tried the sample you provided with the latest version of LLD and the error above is still present. I am afraid I have no idea at the moment what might be wrong. |
Hey, I just checked with ld.lld v8.0.0 and the issue persists. Using clang to compile instead, the issue disappears. Nevertheless, this isn't a real fix, is it? gcc and ld.lld should work together, no? Also, considering that the GCC produced .o files work fine with ld.bfd and ld.gold, I still believe that it's somehow ld.lld that is missing support for something that leads to this issue here? |
It'd certainly be worth understanding what lld's doing differently from binutils ld on GCC's output. But without knowing the specifics it's hard to know where the bug is. Do you have/could you try to make a small reproducer? |
We are seeing the same or a similar issue when we enable ASan in gcc: $ cat test.c Should we create a new issue for this? We are running Arch Linux with gcc and gdb 8.2.1, lld 8.0.0, and binutils 2.31.1 from the official repositories. This behavior was found in the course of the SYMBIOSYS research project at COMSYS, RWTH Aachen University. This research is supported by the European Research Council (ERC) under the EU's Horizon 2020 Research and Innovation Programme grant agreement n. 647295 (SYMBIOSYS). |
Can't seem to reproduce it with this selection of versions, at least: $ ./ld --version |
This may be a problem specific to Arch Linux, since I can't reproduce it on Debian or Fedora either. However, on Arch Linux, I can reproduce it even when I use the exact same GCC and LLVM versions as you. I think Milian is running Arch Linux as well, because the libraries in his reproducible build match mine exactly. |
Does the Arch version of Clang have extra patches in it? I remember |
Yes, I'm using Arch. I'll try to finally get a source build of clang up and running again and then see if the issue is resolved. The patches that Arch is applying can be found here: https://git.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/clang |
reproducer gcc is gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) lld is eaf92ac I am happy to add the gcc produced .o files if this still doesn't reproduce for you. |
Any chance of further reduction? (I don't tend to do LLD development anyway, so perhaps someone else will pick it up with what's already in this thread; I'm just curious about what's causing this.) One option is taking the assembly and seeing whether it's assembled differently by GCC, or whether taking GCC's assembly and assembling it with Clang's integrated assembler reproduces the same odd behavior in LLD. Reducing the assembly might also make it easier to see what the interesting feature is. |
reproducer I left the .s files in the archive, so it should be possible to reproduce this without gcc. Note that the debug info in udp.s must be compressed for this to reproduce. |
reproducer 52 ip_checksum.s % gcc -c ip_checksum.s Replacing lld with gold avoids the readelf warnings. Not using -gz and lld also avoids the readelf warning. |
Thanks for providing this! I wasn't able to reproduce the readelf (2.28 in my case) warning I also tried using llvm-mc with -compress-debug-sections option (for udp.s) to For me, readelf reports no warnings, but dwarfdump (-V 2018-01-29 11:08:35-08:00) reports: My tools and environment are a bit outdated, so I plan to update everything and retry soon. |
Interesting that the tools versions are important. Maybe gold had a similar bug in the past? The tools I have are GNU readelf version 2.31.1-24.fc29 |
I suspect that there may be something broken in your assembly files now, because dwarfdump already complains when I call it on udp.o, even when I assemble it with clang. Neither readelf nor llvm-dwarfdump was able to detect that error though. |
Had no success either on a fresh ubuntu: umb@ubuntu:~/tests/gdb-issue$ lsb_release -a My gcc -v is as -v shows: When I do: I get: If I use GNU as I have the same (expected I think): .debug_info llvm-mc (latest atm) shows: .debug_info Then if I try to readelf the output produced by lld/gold: umb@ubuntu: I have: umb@ubuntu:~/tests/gdb-issue$ Compilation Unit @ offset 0x0: umb@ubuntu: Compilation Unit @ offset 0x0: i.e. no visible errors from readelf here. |
Perhaps that is what we might want to try then? |
testcase
Hopefully that will let us find what is different between our systems :-) |
Thanks! New results are below. To start, I tried to check the objects and binaries from your archive
umb@ubuntu:~/tests/2$ readelf --debug-dump=info udp.o
umb@ubuntu:~/tests/2$ readelf --debug-dump=info test.gold Compilation Unit @ offset 0x0: umb@ubuntu:~/tests/2$ readelf --debug-dump=info test.lld Compilation Unit @ offset 0x0:
umb@ubuntu:~/tests/2$ Compilation Unit @ offset 0x0: umb@ubuntu: Compilation Unit @ offset 0x0: i.e. both LLD and gold produce the broken output for me with your objects. Next thing I plan to do is probably to build the latest binutils from source and debug the readelf I'll also try linking with the gold built from head sources. |
OK, great! I was finally able to reproduce it. When I use the latest Binutils, readelf does not show me an error for udp.o umb@ubuntu: Default readelf (2.31.1) still complains: umb@ubuntu: Then when I link those objects with the latest gold, I see the correct output: umb@ubuntu:~/tests/2$ Compilation Unit @ offset 0x0: Now the same objects with an almost-latest LLD: umb@ubuntu:~/tests/2$ Compilation Unit @ offset 0x0: I think this is a good point from which to continue the investigation. Thanks for the inputs, Rafael! |
I finally found the issue. Here is the partial readelf -a udp.o output: Section Headers: You may notice that .debug_info is compressed and aligned to 8 bytes. Below is the disassembly of the gold and LLD output at the beginning of the second compile unit in the executable: gold: LLD:
It looks like after decompressing a section we should probably reset its Align field to 1. At the very least, LLD obviously should not add any zeroes to .debug_info by itself, because that damages the final content. Tomorrow I am going to look into gold's source code to see what it does, and hopefully I will prepare a patch. |
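To illustrate why alignment padding corrupts .debug_info, here is a minimal C++ sketch. It is not LLD's code: the helpers alignTo, makeUnit, layout and versionAt are made up for this example, and the units are bare 32-bit DWARF v4 headers with no DIEs. A consumer walks .debug_info by adding each unit's length to its offset, so it expects the next unit to start immediately after the previous one. If the linker pads each input section to 8 bytes, the consumer lands in the padding and reads a nonsense version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Round Offset up to the next multiple of Align (Align must be a power of two).
inline std::size_t alignTo(std::size_t Offset, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0);
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical minimal 32-bit DWARF v4 compile unit header, little-endian:
// unit_length (4) | version (2) | debug_abbrev_offset (4) | address_size (1).
inline std::vector<std::uint8_t> makeUnit() {
  return {7, 0, 0, 0, // unit_length: 7 bytes follow this field
          4, 0,       // version 4
          0, 0, 0, 0, // debug_abbrev_offset
          8};         // address_size
}

// Concatenate input sections into one output section, zero-padding so that
// every input starts at a multiple of Align.
inline std::vector<std::uint8_t>
layout(const std::vector<std::vector<std::uint8_t>> &Inputs, std::size_t Align) {
  std::vector<std::uint8_t> Out;
  for (const auto &In : Inputs) {
    Out.resize(alignTo(Out.size(), Align), 0);
    Out.insert(Out.end(), In.begin(), In.end());
  }
  return Out;
}

// Read the little-endian version field of a 32-bit unit header at Offset.
inline unsigned versionAt(const std::vector<std::uint8_t> &Sec, std::size_t Offset) {
  return Sec[Offset + 4] | (Sec[Offset + 5] << 8);
}
```

With two such units, layout(..., 1) places the second unit at offset 11, where the consumer expects it, and versionAt returns 4. With layout(..., 8) the second unit moves to offset 16, and reading at offset 11 yields 1792 instead of 4, the same kind of bogus version number that gdb reports in this issue.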
The patch to readelf that avoids the warning is The .debug_info section in udp.o is aligned to 8, but the Elf64_Chdr compression header says that the uncompressed data has an alignment of 1. It looks like lld is, and readelf was, using the alignment of the compressed data as the alignment of the uncompressed data. |
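The distinction between the two alignment fields can be sketched as follows. This is an illustration under stated assumptions, not the readelf or LLD patch: SectionInfo, CompressionHeader and contentAlignment are hypothetical names, and only the fields relevant here are modelled. The field names sh_flags, sh_addralign, ch_size and ch_addralign and the SHF_COMPRESSED value 0x800 come from the ELF gABI.

```cpp
#include <cstdint>

// Value of SHF_COMPRESSED in the ELF gABI.
constexpr std::uint64_t kShfCompressed = 0x800;

// Subset of a section header (hypothetical struct for this sketch).
struct SectionInfo {
  std::uint64_t sh_flags;
  std::uint64_t sh_addralign; // alignment of the bytes as stored in the file
};

// Subset of a compression header (hypothetical struct for this sketch).
struct CompressionHeader {
  std::uint64_t ch_size;      // size of the data after decompression
  std::uint64_t ch_addralign; // alignment of the data after decompression
};

// Alignment the section contents need once any compression is undone.
// For a compressed section, sh_addralign describes the compressed bytes
// (header included), so the answer has to come from ch_addralign instead.
inline std::uint64_t contentAlignment(const SectionInfo &S,
                                      const CompressionHeader &H) {
  if (S.sh_flags & kShfCompressed)
    return H.ch_addralign;
  return S.sh_addralign;
}
```

For the udp.o case described above (sh_addralign of 8, ch_addralign of 1), this returns 1, so no padding would be inserted between the decompressed .debug_info contributions.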
Yes, you are right. I forgot that we also have a ch_addralign field in the GNU compression headers. It was used when compressing but was never used when decompressing sections in LLD. The patch for this bug is: https://reviews.llvm.org/D60959 Also, that revealed one more bug: While GNU assembler version 2.32.51 (x86_64-pc-linux-gnu) sets it to 8 for me now. I think llvm-mc should do the same (perhaps it should be 4 for a 32-bit target), since auto *Hdr = reinterpret_cast<const Chdr64 *>(RawData.data()); might cause UB. (C11 standard (§ 6.3.2.3, paragraph 7): |
This bug should be fixed in r358885. Patch for llvm-mc was posted here: |
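As a side note on the undefined-behavior concern: converting a byte pointer to a pointer of a more strictly aligned type is undefined when the address is not suitably aligned. One common way to sidestep that, independent of what alignment the producer chose, is to copy the header out with memcpy. The sketch below is not the LLD or llvm-mc patch; this Chdr64 struct and the readChdr64 helper are written for this example, and it assumes the host byte order matches the object file (for example an x86_64 host reading an x86_64 object). It needs C++17 for std::optional.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Layout of Elf64_Chdr as given by the ELF gABI.
struct Chdr64 {
  std::uint32_t ch_type;
  std::uint32_t ch_reserved;
  std::uint64_t ch_size;
  std::uint64_t ch_addralign;
};
static_assert(sizeof(Chdr64) == 24, "Elf64_Chdr is 24 bytes");

// Copy the compression header out of a raw byte buffer. memcpy places no
// alignment requirement on Data, unlike dereferencing a cast pointer.
inline std::optional<Chdr64> readChdr64(const std::uint8_t *Data,
                                        std::size_t Size) {
  if (Size < sizeof(Chdr64))
    return std::nullopt; // buffer too small to hold a header
  Chdr64 Hdr;
  std::memcpy(&Hdr, Data, sizeof(Hdr));
  return Hdr;
}
```

The trade-off is a 24-byte copy per section, which is negligible next to the decompression itself.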
It is fixed for me. Thank you so much! I agree that producers should set the alignment of the compressed section to 4 or 8 since the compression header requires that alignment. |
I can confirm that this fixes the bug for me as well. Thank you! |
Closing it then. Thanks, everyone, for the comments and help! |
I also just tested this and it works like a charm! Can the fix be backported into the 8.x branch please? It applies cleanly. Thanks a lot everyone for the help |
Yeah, it looks like this is a good candidate for a dot release. George, can you file a cherry-pick request? |
Done: llvm/llvm-bugzilla-archive#41580 |
*** Bug llvm/llvm-bugzilla-archive#42401 has been marked as a duplicate of this bug. *** |
mentioned in issue llvm/llvm-bugzilla-archive#40532 |
mentioned in issue llvm/llvm-bugzilla-archive#41580 |
mentioned in issue llvm/llvm-bugzilla-archive#42401 |
Extended Description
$ g++ --version
g++ (GCC) 8.2.1 20181127
$ ld --version
LLD 7.0.1 (compatible with GNU linkers)
$ gdb --version
GNU gdb (GDB) 8.2.1
I just tried to switch from ld.gold to ld.lld for performance reasons, but hit a roadblock: apparently the DWARF emitted into the final executable or library is bogus, which leads to errors when consumers like gdb, valgrind, bloaty, ... try to load it.
Here are some examples from a single KDE project (ki18n). Note that so far I have not been able to reproduce this in a simplified standalone example.
Reading symbols from bin/libKF5I18n.so...Dwarf Error: bad offset (0xcc08000026980004) in compilation unit header (offset 0x6fbe2 + 6) [in module /home/milian/projects/kf5/build-dbg/frameworks/ki18n/bin/libKF5I18n.so.5.54.0]
Reading symbols from ktranscript.so...Dwarf Error: wrong version in compilation unit header (is 422, should be 2, 3, 4 or 5) [in module /home/milian/projects/kf5/build-dbg/frameworks/ki18n/bin/ktranscript.so]
Reading symbols from ki18n-ktranscripttest...Dwarf Error: wrong version in compilation unit header (is 1024, should be 2, 3, 4 or 5) [in module /home/milian/projects/kf5/build-dbg/frameworks/ki18n/bin/ki18n-ktranscripttest]
I have seen at least one more variation of the version header issue (version 514).
When I instead link with ld.gold, none of these issues show up.
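For readers unfamiliar with the "wrong version" messages above, here is a rough C++ sketch of the kind of check a DWARF consumer performs on each unit header. It is not gdb's code: UnitHeader, readLE, readUnitHeader and isSupportedVersion are names invented for this example, and it assumes little-endian data. The accepted range 2 to 5 is taken from the gdb message itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

struct UnitHeader {
  std::uint64_t Length;  // unit_length, not counting the length field itself
  std::uint16_t Version; // DWARF version of this unit
  bool IsDwarf64;        // true if the 0xffffffff escape was used
};

// Read an N-byte little-endian integer starting at Off.
inline std::uint64_t readLE(const std::vector<std::uint8_t> &B,
                            std::size_t Off, std::size_t N) {
  std::uint64_t V = 0;
  for (std::size_t I = 0; I < N; ++I)
    V |= std::uint64_t(B[Off + I]) << (8 * I);
  return V;
}

// Parse the initial length and version of the unit header at Off.
inline std::optional<UnitHeader>
readUnitHeader(const std::vector<std::uint8_t> &Sec, std::size_t Off) {
  if (Off > Sec.size() || Sec.size() - Off < 6)
    return std::nullopt; // need 4 length bytes + 2 version bytes
  std::uint64_t Len = readLE(Sec, Off, 4);
  std::size_t Pos = Off + 4;
  bool Is64 = false;
  if (Len == 0xffffffffu) { // 64-bit DWARF: real length follows in 8 bytes
    if (Sec.size() - Off < 14)
      return std::nullopt;
    Len = readLE(Sec, Pos, 8);
    Pos += 8;
    Is64 = true;
  }
  return UnitHeader{Len, static_cast<std::uint16_t>(readLE(Sec, Pos, 2)), Is64};
}

// Versions the gdb message above lists as acceptable.
inline bool isSupportedVersion(std::uint16_t V) { return V >= 2 && V <= 5; }
```

The reported values 422, 1024 and 514 all fail this check. As an aside, 1024 is 0x0400, which is what the bytes of a version 4 field would read as if the reader were off by one byte; that would be consistent with stray bytes in front of the unit, although the description here does not confirm the exact offset.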