LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 3494 - assertion thrown in codegen of debug info "Multiple main"
Summary: assertion thrown in codegen of debug info "Multiple main"
Status: RESOLVED FIXED
Alias: None
Product: new-bugs
Classification: Unclassified
Component: new bugs
Version: unspecified
Hardware: PC Linux
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Duplicates: 3718
Depends on:
Blocks:
 
Reported: 2009-02-05 23:49 PST by Nick Lewycky
Modified: 2009-07-01 14:09 PDT

See Also:
Fixed By Commit(s):


Attachments
testcase (1.36 KB, application/octet-stream)
2009-02-05 23:49 PST, Nick Lewycky

Description Nick Lewycky 2009-02-05 23:49:04 PST
Created attachment 2503
testcase

When building these with LLVMgold:

  $ cat test.c
  extern int foo(void);
  int main (void) {
   return foo();
  }
  $ cat test2.c
  int foo (void) {
   return 0;
  }

we get an assertion failure. Note that there's only one main.

$ llc linked.bc
llc: DwarfWriter.cpp:2786: void llvm::DwarfDebug::ConstructCompileUnits(): Assertion `!MainCU && "Multiple main compile units are found!"' failed.
[New Thread 0xf7c296c0 (LWP 23830)]

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xf7c296c0 (LWP 23830)]
0xffffe425 in __kernel_vsyscall ()
(gdb) bt
#0  0xffffe425 in __kernel_vsyscall ()
#1  0xf7c56640 in raise () from /lib/i686/cmov/libc.so.6
#2  0xf7c58018 in abort () from /lib/i686/cmov/libc.so.6
#3  0xf7c4f5be in __assert_fail () from /lib/i686/cmov/libc.so.6
#4  0x08a9412a in llvm::DwarfDebug::ConstructCompileUnits (this=0x8f983a8)
    at DwarfWriter.cpp:2786
#5  0x08a946e5 in llvm::DwarfDebug::SetDebugInfo (this=0x8f983a8, 
    mmi=0x8f94a88) at DwarfWriter.cpp:2893
#6  0x08a729a0 in llvm::DwarfWriter::BeginModule (this=0x8f8c1d0, M=0x8f73840, 
    MMI=0x8f94a88, OS=@0x8f786b8, A=0x8f965b0, T=0x8f7e028)
    at DwarfWriter.cpp:4233
#7  0x0872868b in llvm::X86ATTAsmPrinter::doInitialization (this=0x8f965b0, 
    M=@0x8f73840) at X86ATTAsmPrinter.cpp:742
#8  0x08cdbe0c in llvm::FPPassManager::doInitialization (this=0x8f76f68, 
    M=@0x8f73840) at PassManager.cpp:1356
#9  0x08cdc043 in llvm::FunctionPassManagerImpl::doInitialization (
    this=0x8f74138, M=@0x8f73840) at PassManager.cpp:1250
#10 0x08cd2d54 in llvm::FunctionPassManager::doInitialization (this=0xffe2c5b8)
    at PassManager.cpp:1233
#11 0x08414c01 in main (argc=2, argv=0xffe2c6d4) at llc.cpp:311

The intermediate result is 'linked.bc' attached. I've decided not to bugpoint it in case that discards useful information. (Is the .bc bad? Or is it a bug in dwarf emission?)
Comment 1 Duncan Sands 2009-02-06 02:01:12 PST
I saw the same thing when doing LTO on a program with debug info.
This program also had only one main!
Comment 2 Devang Patel 2009-02-06 12:28:39 PST
This is expected because LTO is not updated to handle debug info yet.

The FE marks one compile unit as "main". Here "main" is not associated with the "main" function; it identifies the main source file (i.e. main_input_filename in the GCC FE sources). For example,

-- a.c --
#include "a.h"
void foo() { ... }
-- a.h --
#include "b.h"
-- b.h --
struct p {int x; int y;};
-- b.c --
#include "b.h"
void bar() {...}
--

$ llvm-gcc -c -g a.c -o a.o
$ llvm-gcc -c -g b.c -o b.o

Now, a.o will have three compile units, one for each input file: 1) a.h, 2) b.h and 3) a.c. Of these, only a.c's compile unit is marked as MainCU during code gen.

b.o will have two compile units and b.c's compile unit is marked as MainCU during code gen.

In other words, each llvm module created by FE will have one compile unit marked as MainCU.

When one compile unit is marked as MainCU, the code generator will emit all DIEs into that compile unit. Otherwise the code generator will emit multiple CUs in one .o file. In the second case, it is expected that the system linker will eliminate duplicate CUs from the various .o files at link time. The darwin linker does not do this. This second path through the code generator is not well tested yet.

Now, during LTO the code generator sees one big module, which is created by merging multiple modules created by the FE. So the LTO library needs to do one of the following while creating one big, optimized object file (even if it is temporary).

1) If the platform linker can eliminate duplicate compile units from various incoming input files then clear MainCU bits from all incoming compile units.

2) Or, clear MainCU bits from all except one compile unit, which is considered "main" for the combined and optimized object file.

On the darwin platform, we'll follow step 2). The gold linker plugin will also have to handle this appropriately.
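The two options can be sketched on a toy model of the merged module. This is only an illustration, not LLVM code: CompileUnitDesc, countMainCUs, keepSingleMainCU and clearAllMainCUs are hypothetical names, and the isMain field stands in for the MainCU bit described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR-level compile unit descriptor.
// This is NOT the real LLVM data structure.
struct CompileUnitDesc {
  std::string file;  // source file the unit describes
  bool isMain;       // the "main compile unit" bit set by the front end
};

// Count the units that still carry the main bit. The code generator
// assertion in this report fires when this count exceeds one.
std::size_t countMainCUs(const std::vector<CompileUnitDesc> &units) {
  std::size_t n = 0;
  for (const CompileUnitDesc &cu : units)
    if (cu.isMain)
      ++n;
  return n;
}

// Option 2: make the unit at index `keep` the only main unit.
void keepSingleMainCU(std::vector<CompileUnitDesc> &units, std::size_t keep) {
  for (std::size_t i = 0; i < units.size(); ++i)
    units[i].isMain = (i == keep);
}

// Option 1: clear the main bit everywhere and rely on the linker
// to eliminate duplicate compile units.
void clearAllMainCUs(std::vector<CompileUnitDesc> &units) {
  for (CompileUnitDesc &cu : units)
    cu.isMain = false;
}
```

After merging a.o and b.o from the example, two units (a.c and b.c) carry the bit. Calling keepSingleMainCU with the index of a.c leaves exactly one, which is the state the code generator expects.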
Comment 3 Nick Lewycky 2009-02-09 19:05:55 PST
I'm still confused. Where did this notion of main compile unit come from? Is it defined in DWARF? Or is it just a GCC and LLVM convention?

Currently gold can't eliminate duplicate globals, but a patch exists and should be added soon. Is that what you're referring to when you write about "eliminate duplicate compile units" or are compile units some other construct in the .o file?

Sorry I'm not familiar with this stuff. I appreciate you taking the time to explain!
Comment 4 Devang Patel 2009-02-09 19:26:47 PST
I do not know what gold can do today or in future.

There are two kinds of compile units:
  1) llvm IR  compile unit 
  2) DWARF compile unit.

The llvm IR uses the term compile_unit to refer to an input file.
(See http://llvm.org/docs/SourceLevelDebugging.html#format_compile_units). This means the llvm IR contains multiple compile_units in a module. One of these compile_units represents the main input source file and it is marked as such.

Now, the code generator produces compile units as specified by DWARF. It seems DWARF allows multiple DW_TAG_compile_unit entries (per function or per included header), so that they can be compressed later on. I am not a DWARF expert.

---
Section 3.1.1 from DWARF says,

" In a compilation employing DWARF space compression and duplicate elimination techniques (see Appendix E), multiple compilation units using the tags DW_TAG_compile_unit and/or DW_TAG_partial_unit 
are used to represent portions of an object file."

And there is "E.4.3 Single-function-per-DWARF-compilation-unit".
---

So... the llvm code generator is responsible for taking the multiple incoming LLVM IR level compile units and generating an appropriate number of DWARF compile units.

If there is a many to many (LLVM IR -> DWARF) mapping then none of the LLVM IR level compile units should be marked as MainCU. 

If there is a many to one (LLVM IR -> DWARF) mapping then only one LLVM IR level compile unit should be marked as MainCU.

Comment 5 Duncan Sands 2009-02-10 07:44:00 PST
If I understand right, the concept of a main compile unit
is an LLVM workaround for a weakness of the darwin linker?
Does anyone know what gcc-4.4 does (it has some support
for LTO)?
Comment 6 Devang Patel 2009-02-10 11:26:49 PST
No, you're mistaken. It is not a work around for a weakness of the darwin linker.


Comment 7 Nick Lewycky 2009-02-10 12:05:46 PST
Yesterday I asked Cary, a gold developer, about this PR and he agrees that option #2 is the right approach for gold as well.

He also said that option #1 might not work in general. If you have a case where you've inlined function C (from compile unit C) into A and B (in compile units A and B respectively), and the debug info for C is modified in two different ways as a result of simplification from inlining, you'd get two (COMDAT?) sections that aren't mergeable.
Comment 8 Duncan Sands 2009-02-10 14:33:04 PST
OK, I guess I just don't understand what the "main compile unit"
is good for.
Comment 9 Duncan Sands 2009-02-11 14:57:22 PST
I hoped to get some insight into this using gcc's -combine option,
but it seems it only really compiles one file, and only uses info from
the others for extra simplifications.  In any case, I couldn't find
anything interesting in the dwarf output.
Comment 10 Luke Dalessandro 2009-03-16 08:25:31 PDT
(In reply to comment #2)
> 2) Or, clear MainCU bits from all except one compile unit, which is
> considered "main" for the combined and optimized object file.

Hi Devang,

As a workaround, is there any way to do this manually in the large .ll file that I've created? i.e. what exactly am I looking for in the llvm.dbg intrinsics that marks a compile unit as a "main" compile unit? 

I don't mind doing a little processing to go through and strip this from all-but-one unit. Given that this is possible, is there some "preferred" main compile unit (ie, should I just associate the main tag with whichever unit "main" comes from)?

Luke

Comment 11 Devang Patel 2009-03-16 13:47:30 PDT
Luke,

@llvm.dbg.compile_unit = internal constant %llvm.dbg.compile_unit.type {
i32 458769, 
%0* bitcast (%llvm.dbg.anchor.type* @llvm.dbg.compile_units to %0*),
i32 1, 
i8* getelementptr ([4 x i8]* @.str, i32 0, i32 0), 
i8* getelementptr ([5 x i8]* @.str1, i32 0, i32 0), 
i8* getelementptr ([52 x i8]* @.str2, i32 0, i32 0),  
i1 true, ; <--- true if this is a "main" compile unit, false otherwise.
i1 false, 
i8* null, 
i32 0 }, section "llvm.metadata"                ; <%llvm.dbg.compile_unit.type*> [#uses=1]


You can use DICompileUnit() from DebugInfo.h to access this info. It is easy to add setMainCU() to set this bit. 

There is not any "preferred" main compile unit. So using the one that holds the "int main()" function is reasonable.
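For the manual workaround on a large .ll file, a text-level pass along these lines might be enough. It is a rough sketch under several assumptions: keepFirstMainCU is a hypothetical helper, each compile unit descriptor is printed on a single line, the initializer uses the type name %llvm.dbg.compile_unit.type, and the first i1 field in it is the "main" flag as in the layout shown above. It keeps the first flagged unit it meets rather than the one that holds "int main()".

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Rewrites textual IR so that at most one compile unit descriptor keeps
// "i1 true" in its main-unit field. Assumption: one descriptor per line,
// and the first i1 field after the type name is the main flag.
std::string keepFirstMainCU(const std::string &ir) {
  const std::string typeTag = "%llvm.dbg.compile_unit.type {";
  const std::string mainTrue = "i1 true";
  std::istringstream in(ir);
  std::string out, line;
  bool seenMain = false;
  while (std::getline(in, line)) {
    std::size_t start = line.find(typeTag);
    if (start != std::string::npos) {
      // Locate the first i1 field of the initializer.
      std::size_t flag = line.find("i1 ", start + typeTag.size());
      if (flag != std::string::npos &&
          line.compare(flag, mainTrue.size(), mainTrue) == 0) {
        if (seenMain)
          line.replace(flag, mainTrue.size(), "i1 false");  // demote extras
        else
          seenMain = true;  // the first main unit is left untouched
      }
    }
    out += line;
    out += '\n';
  }
  return out;
}
```

Only the first i1 field is inspected, so the second i1 field of each descriptor is left as it was. The result should still be checked by hand before feeding it back to llc.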
Comment 12 Devang Patel 2009-04-09 16:09:54 PDT
*** Bug 3718 has been marked as a duplicate of this bug. ***
Comment 13 devang.patel 2009-06-30 16:06:09 PDT
Is this still failing after rev. 74449 ?
Comment 14 Haohui Mai 2009-06-30 22:26:49 PDT
Works for me.

I tested on a way bigger program.
Comment 15 devang.patel 2009-07-01 12:58:36 PDT
Thanks!
Comment 16 Luke Dalessandro 2009-07-01 14:09:02 PDT
(In reply to comment #13)
> Is this still failing after rev. 74449 ?
> 

Works here too. Thanks.

Luke