LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 23472 - missing comdats for static initializers in template and guard variables
Summary: missing comdats for static initializers in template and guard variables
Status: RESOLVED FIXED
Alias: None
Product: new-bugs
Classification: Unclassified
Component: new bugs (show other bugs)
Version: trunk
Hardware: PC Windows NT
: P normal
Assignee: Rafael Ávila de Espíndola
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-05-11 03:15 PDT by Yaron Keren
Modified: 2015-09-03 15:47 PDT (History)
9 users (show)

See Also:
Fixed By Commit(s):


Attachments
header (193 bytes, text/plain)
2015-05-11 03:15 PDT, Yaron Keren
Details
l1.cpp (80 bytes, text/plain)
2015-05-11 03:15 PDT, Yaron Keren
Details
l2.cpp (81 bytes, text/plain)
2015-05-11 03:16 PDT, Yaron Keren
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Yaron Keren 2015-05-11 03:15:30 PDT
Created attachment 14307 [details]
header

compiling the attached sources on Windows x64 using

 clang++ l1.cpp l2.cpp -target i686-pc-windows-gnu

results in

 multiple definition of `guard variable for lanczos_initializer<int>::initializer'

where as g++ is ok:

 g++ l1.cpp l2.cpp

the object files shows comdats with g++ but not with clang:

 >g++ l1.cpp -c & nm l1.o
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 d .data$_ZGVN19lanczos_initializerIiE11initializerE
00000000 d .data$_ZN19lanczos_initializerIiE11initializerE
00000000 r .eh_frame
00000000 r .eh_frame$_ZN4initC1Ev
00000000 r .rdata$zzz
00000000 t .text
00000000 t .text$_ZN4initC1Ev
00000037 t __GLOBAL__sub_I__Z2l1v
00000000 T __Z2l1v
00000005 t __Z41__static_initialization_and_destruction_0ii
00000000 D __ZGVN19lanczos_initializerIiE11initializerE
00000000 D __ZN19lanczos_initializerIiE11initializerE
00000000 T __ZN4initC1Ev


 >clang++ l1.cpp -c -target i686-pc-windows-gnu & nm l1.o
00000000 b .bss
00000000 b .bss
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 d .eh_frame
00000000 t .text
00000000 t .text
00000000 t .text
00000001 a @feat.00
00000000 t ___cxx_global_var_init
00000000 T __Z2l1v
00000000 B __ZGVN19lanczos_initializerIiE11initializerE
00000000 B __ZN19lanczos_initializerIiE11initializerE
00000000 T __ZN4initC2Ev
Comment 1 Yaron Keren 2015-05-11 03:15:53 PDT
Created attachment 14308 [details]
l1.cpp
Comment 2 Yaron Keren 2015-05-11 03:16:11 PDT
Created attachment 14309 [details]
l2.cpp
Comment 3 Reid Kleckner 2015-05-11 15:19:18 PDT
The comdat isn't missing, the guard variable is just associated with the global variable being guarded. This is supposed to do the right thing, and is consistent with what we do on Linux. BFD ld might be doing the wrong thing, but MSVC's linker also agrees that something is wrong:

$ clang++ l1.cpp --target=x86_64-windows-gnu -o l1.obj -c
$ clang++ l2.cpp --target=x86_64-windows-gnu -o l2.obj -c
$ link -nologo l1.obj l2.obj -out:t.exe && ./t.exe
l1.obj : fatal error LNK1243: invalid or corrupt file: COMDAT section 0x4 associated with following section 0x6

I'm going to try reordering the sections by hacking the assembly, and we'll see what it says.
Comment 4 Reid Kleckner 2015-05-11 15:24:16 PDT
Yeah, after rearranging things in the .s files to bypass that link.exe error, we have this problem:
$ link -nologo -defaultlib:libcmt l1.obj l2.obj -out:t.exe && ./t.exe
l2.obj : error LNK2005: _ZGVN19lanczos_initializerIiE11initializerE already defined in l1.obj
l2.obj : error LNK2019: unresolved external symbol __main referenced in function main
t.exe : fatal error LNK1120: 1 unresolved externals

The __main error is probably just a missing mingw64 library, but the other issue is real.
Comment 5 Yaron Keren 2015-05-12 01:15:25 PDT
This code is heavily reduced boost/math/special_functions/lanczos.hpp.

Thanks for looking into it!
Comment 6 Yaron Keren 2015-05-18 08:05:55 PDT
The issue seems to resolve around ItaniumCXXABI::EmitGuardedInit

    if (!D.isLocalVarDecl() && C) {
      guard->setComdat(C);
      CGF.CurFn->setComdat(C);
    } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {
      guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName()));
    }

when the first branch is taken, we have multiple definition error with ld and LNK1243 error with link.exe.

If we force the second branch (obviously this is not the correct conditional),

    if (false && !D.isLocalVarDecl() && C) {
      guard->setComdat(C);
      CGF.CurFn->setComdat(C);
    } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {
      guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName()));
    }

both ld and link.exe link the resulting objects without errors.
Comment 7 Reid Kleckner 2015-05-18 12:42:39 PDT
Fundamentally, this is a failure to understand the limitations of comdats on our part. I'd really like to understand the difference in semantics between ELF and COFF here before we close this. In the meantime, we can disable that condition for COFF.
Comment 8 Rafael Ávila de Espíndola 2015-06-19 07:49:51 PDT
Rui, are we doing something wrong for associative comdats in here?
Comment 9 Reid Kleckner 2015-07-22 16:32:07 PDT
Sounds like we should disable the optimization and fix the boost build?
Comment 10 Rui Ueyama 2015-07-22 18:46:19 PDT
Looks like clang-cl++ makes `_ZGVN19lanczos_initializerIiE11initializerE` as an external symbol, but MSVC created symbols in associative sections as static. Maybe we should follow what MSVC does?

As to LLD, we don't add symbols in associative sections to the symbol table, so such symbols will never conflict even if they are external. We do that because I think that's the right thing to do. But this seems to be different from what MSVC linker does -- it looks like they just insert all symbols to the symbol table as long as they are external. That describes why LLD does not report an error for this test case. I think we should modify LLD to make it compatible with MSVC.
Comment 11 Reid Kleckner 2015-07-23 11:44:22 PDT
Not adding the guard variable to the symbol table in this case would give the wrong result, so I think we need to remove this optimization. It seems pretty clear to me now that the MSVC linking algorithm resolves all external symbols up front without considering comdat associativity, and then brings some extra data along for the ride if your associated comdat was chosen. We can't put two external symbols in the same group in this model.

It's not clear to me if it's working properly for ELF either, we might just be getting lucky. If the linker changes to be more strict, our object files might become unlinkable.
Comment 12 Yaron Keren 2015-08-27 00:01:50 PDT
As of r245812 this works.
Comment 13 Reid Kleckner 2015-08-28 12:12:34 PDT
I don't see any change that would have fixed this, though. Are we sure everything works correctly?
Comment 14 Yaron Keren 2015-08-28 12:42:10 PDT
You are right, I tested on a local patched copy that disables on COFF as dicussed earlier:

     // The ABI says: It is suggested that it be emitted in the same COMDAT group
     // as the associated data object
     llvm::Comdat *C = var->getComdat();
-    if (!D.isLocalVarDecl() && C) {
+    if (!CGM.getTarget().getTriple().isOSBinFormatCOFF() &&
+        !D.isLocalVarDecl() && C) {
       guard->setComdat(C);
       CGF.CurFn->setComdat(C);
     } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {

Thanks for noticing!
Comment 15 rafael_andreas 2015-08-28 14:58:40 PDT
Will this patch by any chance make it into Clang 3.7?
My project currently doesn't build using clang on windows.
I'm not sure if this issue will arise when cross compiling from Linux, but I currently use GCC for that.
Comment 16 Yaron Keren 2015-08-29 14:33:25 PDT
Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.
Comment 17 Reid Kleckner 2015-08-31 15:29:14 PDT
(In reply to comment #16)
> Do you think this patch is the right way to go?
> If so, I'll post it on cfe-dev so others not following this bug may comment.

Sure. I'm concerned that this optimization doesn't work on ELF either, though, and that maybe we shouldn't be doing it in the first place.
Comment 18 Rafael Ávila de Espíndola 2015-08-31 16:15:42 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > Do you think this patch is the right way to go?
> > If so, I'll post it on cfe-dev so others not following this bug may comment.
> 
> Sure. I'm concerned that this optimization doesn't work on ELF either,
> though, and that maybe we shouldn't be doing it in the first place.

Putting multiple global symbols in a comdat? That is fine on ELF. You just have to be sure that the same symbols are added in every TU.
Comment 19 Reid Kleckner 2015-08-31 16:20:57 PDT
(In reply to comment #18)
> (In reply to comment #17)
> > (In reply to comment #16)
> > > Do you think this patch is the right way to go?
> > > If so, I'll post it on cfe-dev so others not following this bug may comment.
> > 
> > Sure. I'm concerned that this optimization doesn't work on ELF either,
> > though, and that maybe we shouldn't be doing it in the first place.
> 
> Putting multiple global symbols in a comdat? That is fine on ELF. You just
> have to be sure that the same symbols are added in every TU.

GCC doesn't put the guard variable into the comdat of the global variable. Isn't that a problem?
Comment 20 Rafael Ávila de Espíndola 2015-08-31 18:26:27 PDT
(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > (In reply to comment #16)
> > > > Do you think this patch is the right way to go?
> > > > If so, I'll post it on cfe-dev so others not following this bug may comment.
> > > 
> > > Sure. I'm concerned that this optimization doesn't work on ELF either,
> > > though, and that maybe we shouldn't be doing it in the first place.
> > 
> > Putting multiple global symbols in a comdat? That is fine on ELF. You just
> > have to be sure that the same symbols are added in every TU.
> 
> GCC doesn't put the guard variable into the comdat of the global variable.
> Isn't that a problem?

The itanium c++ abi says

Some objects with static storage duration have associated guard variables used to ensure that they are initialized only once (see 3.3.2). If the object is emitted using a COMDAT group, the guard variable must be too. It is suggested that it be emitted in the same COMDAT group as the associated data object, but it may be emitted in its own COMDAT group, identified by its name. In either case, it must be weak.

I remember discussing with someone (Richard maybe?) why the suggestion was not ABI breaking, but I can't remember right now :-(
Comment 21 Reid Kleckner 2015-09-02 13:00:11 PDT
I've convinced myself it works for ELF because compilers have always defined both the global *and* the guard variable in the same TU. You only end up in the bad situation where you get two copies of a comdat member when one TU provides some of the external symbols of a comdat group and not the rest.

With that in mind, I think the right fix is to make this whole optimization conditional on isObjFormatELF() instead of !COFF and !MachO. :)
Comment 22 Yaron Keren 2015-09-03 15:47:50 PDT
r246803