Created attachment 14307 [details] header compiling the attached sources on Windows x64 using clang++ l1.cpp l2.cpp -target i686-pc-windows-gnu results in multiple definition of `guard variable for lanczos_initializer<int>::initializer' where as g++ is ok: g++ l1.cpp l2.cpp the object files shows comdats with g++ but not with clang: >g++ l1.cpp -c & nm l1.o 00000000 b .bss 00000000 d .ctors 00000000 d .data 00000000 d .data$_ZGVN19lanczos_initializerIiE11initializerE 00000000 d .data$_ZN19lanczos_initializerIiE11initializerE 00000000 r .eh_frame 00000000 r .eh_frame$_ZN4initC1Ev 00000000 r .rdata$zzz 00000000 t .text 00000000 t .text$_ZN4initC1Ev 00000037 t __GLOBAL__sub_I__Z2l1v 00000000 T __Z2l1v 00000005 t __Z41__static_initialization_and_destruction_0ii 00000000 D __ZGVN19lanczos_initializerIiE11initializerE 00000000 D __ZN19lanczos_initializerIiE11initializerE 00000000 T __ZN4initC1Ev >clang++ l1.cpp -c -target i686-pc-windows-gnu & nm l1.o 00000000 b .bss 00000000 b .bss 00000000 b .bss 00000000 d .ctors 00000000 d .data 00000000 d .eh_frame 00000000 t .text 00000000 t .text 00000000 t .text 00000001 a @feat.00 00000000 t ___cxx_global_var_init 00000000 T __Z2l1v 00000000 B __ZGVN19lanczos_initializerIiE11initializerE 00000000 B __ZN19lanczos_initializerIiE11initializerE 00000000 T __ZN4initC2Ev
Created attachment 14308 [details] l1.cpp
Created attachment 14309 [details] l2.cpp
The comdat isn't missing, the guard variable is just associated with the global variable being guarded. This is supposed to do the right thing, and is consistent with what we do on Linux. BFD ld might be doing the wrong thing, but MSVC's linker also agrees that something is wrong: $ clang++ l1.cpp --target=x86_64-windows-gnu -o l1.obj -c $ clang++ l2.cpp --target=x86_64-windows-gnu -o l2.obj -c $ link -nologo l1.obj l2.obj -out:t.exe && ./t.exe l1.obj : fatal error LNK1243: invalid or corrupt file: COMDAT section 0x4 associated with following section 0x6 I'm going to try reordering the sections by hacking the assembly, and we'll see what it says.
Yeah, after rearranging things in the .s files to bypass that link.exe error, we have this problem: $ link -nologo -defaultlib:libcmt l1.obj l2.obj -out:t.exe && ./t.exe l2.obj : error LNK2005: _ZGVN19lanczos_initializerIiE11initializerE already defined in l1.obj l2.obj : error LNK2019: unresolved external symbol __main referenced in function main t.exe : fatal error LNK1120: 1 unresolved externals The __main error is probably just a missing mingw64 library, but the other issue is real.
This code is heavily reduced boost/math/special_functions/lanczos.hpp. Thanks for looking into it!
The issue seems to resolve around ItaniumCXXABI::EmitGuardedInit if (!D.isLocalVarDecl() && C) { guard->setComdat(C); CGF.CurFn->setComdat(C); } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) { guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName())); } when the first branch is taken, we have multiple definition error with ld and LNK1243 error with link.exe. If we force the second branch (obviously this is not the correct conditional), if (false && !D.isLocalVarDecl() && C) { guard->setComdat(C); CGF.CurFn->setComdat(C); } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) { guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName())); } both ld and link.exe link the resulting objects without errors.
Fundamentally, this is a failure to understand the limitations of comdats on our part. I'd really like to understand the difference in semantics between ELF and COFF here before we close this. In the meantime, we can disable that condition for COFF.
Rui, are we doing something wrong for associative comdats in here?
Sounds like we should disable the optimization and fix the boost build?
Looks like clang-cl++ makes `_ZGVN19lanczos_initializerIiE11initializerE` as an external symbol, but MSVC created symbols in associative sections as static. Maybe we should follow what MSVC does? As to LLD, we don't add symbols in associative sections to the symbol table, so such symbols will never conflict even if they are external. We do that because I think that's the right thing to do. But this seems to be different from what MSVC linker does -- it looks like they just insert all symbols to the symbol table as long as they are external. That describes why LLD does not report an error for this test case. I think we should modify LLD to make it compatible with MSVC.
Not adding the guard variable to the symbol table in this case would give the wrong result, so I think we need to remove this optimization. It seems pretty clear to me now that the MSVC linking algorithm resolves all external symbols up front without considering comdat associativity, and then brings some extra data along for the ride if your associated comdat was chosen. We can't put two external symbols in the same group in this model. It's not clear to me if it's working properly for ELF either, we might just be getting lucky. If the linker changes to be more strict, our object files might become unlinkable.
As of r245812 this works.
I don't see any change that would have fixed this, though. Are we sure everything works correctly?
You are right, I tested on a local patched copy that disables on COFF as dicussed earlier: // The ABI says: It is suggested that it be emitted in the same COMDAT group // as the associated data object llvm::Comdat *C = var->getComdat(); - if (!D.isLocalVarDecl() && C) { + if (!CGM.getTarget().getTriple().isOSBinFormatCOFF() && + !D.isLocalVarDecl() && C) { guard->setComdat(C); CGF.CurFn->setComdat(C); } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) { Thanks for noticing!
Will this patch by any chance make it into Clang 3.7? My project currently doesn't build using clang on windows. I'm not sure if this issue will arise when cross compiling from Linux, but I currently use GCC for that.
Do you think this patch is the right way to go? If so, I'll post it on cfe-dev so others not following this bug may comment.
(In reply to comment #16) > Do you think this patch is the right way to go? > If so, I'll post it on cfe-dev so others not following this bug may comment. Sure. I'm concerned that this optimization doesn't work on ELF either, though, and that maybe we shouldn't be doing it in the first place.
(In reply to comment #17) > (In reply to comment #16) > > Do you think this patch is the right way to go? > > If so, I'll post it on cfe-dev so others not following this bug may comment. > > Sure. I'm concerned that this optimization doesn't work on ELF either, > though, and that maybe we shouldn't be doing it in the first place. Putting multiple global symbols in a comdat? That is fine on ELF. You just have to be sure that the same symbols are added in every TU.
(In reply to comment #18) > (In reply to comment #17) > > (In reply to comment #16) > > > Do you think this patch is the right way to go? > > > If so, I'll post it on cfe-dev so others not following this bug may comment. > > > > Sure. I'm concerned that this optimization doesn't work on ELF either, > > though, and that maybe we shouldn't be doing it in the first place. > > Putting multiple global symbols in a comdat? That is fine on ELF. You just > have to be sure that the same symbols are added in every TU. GCC doesn't put the guard variable into the comdat of the global variable. Isn't that a problem?
(In reply to comment #19) > (In reply to comment #18) > > (In reply to comment #17) > > > (In reply to comment #16) > > > > Do you think this patch is the right way to go? > > > > If so, I'll post it on cfe-dev so others not following this bug may comment. > > > > > > Sure. I'm concerned that this optimization doesn't work on ELF either, > > > though, and that maybe we shouldn't be doing it in the first place. > > > > Putting multiple global symbols in a comdat? That is fine on ELF. You just > > have to be sure that the same symbols are added in every TU. > > GCC doesn't put the guard variable into the comdat of the global variable. > Isn't that a problem? The itanium c++ abi says Some objects with static storage duration have associated guard variables used to ensure that they are initialized only once (see 3.3.2). If the object is emitted using a COMDAT group, the guard variable must be too. It is suggested that it be emitted in the same COMDAT group as the associated data object, but it may be emitted in its own COMDAT group, identified by its name. In either case, it must be weak. I remember discussing with someone (Richard maybe?) why the suggestion was not ABI breaking, but I can't remember right now :-(
I've convinced myself it works for ELF because compilers have always defined both the global *and* the guard variable in the same TU. You only end up in the bad situation where you get two copies of a comdat member when one TU provides some of the external symbols of a comdat group and not the rest. With that in mind, I think the right fix is to make this whole optimization conditional on isObjFormatELF() instead of !COFF and !MachO. :)
r246803