Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missing comdats for static initializers in template and guard variables #23846

Closed
llvmbot opened this issue May 11, 2015 · 20 comments
Closed

missing comdats for static initializers in template and guard variables #23846

llvmbot opened this issue May 11, 2015 · 20 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented May 11, 2015

Bugzilla Link 23472
Resolution FIXED
Resolved on Sep 03, 2015 15:47
Version trunk
OS Windows NT
Attachments header, l1.cpp, l2.cpp
Reporter LLVM Bugzilla Contributor
CC @dschuff,@Ivan171,@zygoloid,@rnk,@rui314

Extended Description

compiling the attached sources on Windows x64 using

clang++ l1.cpp l2.cpp -target i686-pc-windows-gnu

results in

multiple definition of `guard variable for lanczos_initializer::initializer'

where as g++ is ok:

g++ l1.cpp l2.cpp

the object files shows comdats with g++ but not with clang:

g++ l1.cpp -c & nm l1.o
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 d .data$_ZGVN19lanczos_initializerIiE11initializerE
00000000 d .data$_ZN19lanczos_initializerIiE11initializerE
00000000 r .eh_frame
00000000 r .eh_frame$_ZN4initC1Ev
00000000 r .rdata$zzz
00000000 t .text
00000000 t .text$_ZN4initC1Ev
00000037 t __GLOBAL__sub_I__Z2l1v
00000000 T __Z2l1v
00000005 t __Z41__static_initialization_and_destruction_0ii
00000000 D __ZGVN19lanczos_initializerIiE11initializerE
00000000 D __ZN19lanczos_initializerIiE11initializerE
00000000 T __ZN4initC1Ev

clang++ l1.cpp -c -target i686-pc-windows-gnu & nm l1.o
00000000 b .bss
00000000 b .bss
00000000 b .bss
00000000 d .ctors
00000000 d .data
00000000 d .eh_frame
00000000 t .text
00000000 t .text
00000000 t .text
00000001 a @​feat.00
00000000 t ___cxx_global_var_init
00000000 T __Z2l1v
00000000 B __ZGVN19lanczos_initializerIiE11initializerE
00000000 B __ZN19lanczos_initializerIiE11initializerE
00000000 T __ZN4initC2Ev

@rnk
Copy link
Collaborator

rnk commented May 11, 2015

The comdat isn't missing, the guard variable is just associated with the global variable being guarded. This is supposed to do the right thing, and is consistent with what we do on Linux. BFD ld might be doing the wrong thing, but MSVC's linker also agrees that something is wrong:

$ clang++ l1.cpp --target=x86_64-windows-gnu -o l1.obj -c
$ clang++ l2.cpp --target=x86_64-windows-gnu -o l2.obj -c
$ link -nologo l1.obj l2.obj -out:t.exe && ./t.exe
l1.obj : fatal error LNK1243: invalid or corrupt file: COMDAT section 0x4 associated with following section 0x6

I'm going to try reordering the sections by hacking the assembly, and we'll see what it says.

@rnk
Copy link
Collaborator

rnk commented May 11, 2015

Yeah, after rearranging things in the .s files to bypass that link.exe error, we have this problem:
$ link -nologo -defaultlib:libcmt l1.obj l2.obj -out:t.exe && ./t.exe
l2.obj : error LNK2005: _ZGVN19lanczos_initializerIiE11initializerE already defined in l1.obj
l2.obj : error LNK2019: unresolved external symbol __main referenced in function main
t.exe : fatal error LNK1120: 1 unresolved externals

The __main error is probably just a missing mingw64 library, but the other issue is real.

@llvmbot
Copy link
Collaborator Author

llvmbot commented May 12, 2015

This code is heavily reduced boost/math/special_functions/lanczos.hpp.

Thanks for looking into it!

@llvmbot
Copy link
Collaborator Author

llvmbot commented May 18, 2015

The issue seems to resolve around ItaniumCXXABI::EmitGuardedInit

if (!D.isLocalVarDecl() && C) {
  guard->setComdat(C);
  CGF.CurFn->setComdat(C);
} else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {
  guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName()));
}

when the first branch is taken, we have multiple definition error with ld and LNK1243 error with link.exe.

If we force the second branch (obviously this is not the correct conditional),

if (false && !D.isLocalVarDecl() && C) {
  guard->setComdat(C);
  CGF.CurFn->setComdat(C);
} else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {
  guard->setComdat(CGM.getModule().getOrInsertComdat(guard->getName()));
}

both ld and link.exe link the resulting objects without errors.

@rnk
Copy link
Collaborator

rnk commented May 18, 2015

Fundamentally, this is a failure to understand the limitations of comdats on our part. I'd really like to understand the difference in semantics between ELF and COFF here before we close this. In the meantime, we can disable that condition for COFF.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jun 19, 2015

Rui, are we doing something wrong for associative comdats in here?

@rnk
Copy link
Collaborator

rnk commented Jul 22, 2015

Sounds like we should disable the optimization and fix the boost build?

@rui314
Copy link
Member

rui314 commented Jul 23, 2015

Looks like clang-cl++ makes _ZGVN19lanczos_initializerIiE11initializerE as an external symbol, but MSVC created symbols in associative sections as static. Maybe we should follow what MSVC does?

As to LLD, we don't add symbols in associative sections to the symbol table, so such symbols will never conflict even if they are external. We do that because I think that's the right thing to do. But this seems to be different from what MSVC linker does -- it looks like they just insert all symbols to the symbol table as long as they are external. That describes why LLD does not report an error for this test case. I think we should modify LLD to make it compatible with MSVC.

@rnk
Copy link
Collaborator

rnk commented Jul 23, 2015

Not adding the guard variable to the symbol table in this case would give the wrong result, so I think we need to remove this optimization. It seems pretty clear to me now that the MSVC linking algorithm resolves all external symbols up front without considering comdat associativity, and then brings some extra data along for the ride if your associated comdat was chosen. We can't put two external symbols in the same group in this model.

It's not clear to me if it's working properly for ELF either, we might just be getting lucky. If the linker changes to be more strict, our object files might become unlinkable.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 27, 2015

As of r245812 this works.

@rnk
Copy link
Collaborator

rnk commented Aug 28, 2015

I don't see any change that would have fixed this, though. Are we sure everything works correctly?

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 28, 2015

You are right, I tested on a local patched copy that disables on COFF as dicussed earlier:

 // The ABI says: It is suggested that it be emitted in the same COMDAT group
 // as the associated data object
 llvm::Comdat *C = var->getComdat();
  • if (!D.isLocalVarDecl() && C) {
  • if (!CGM.getTarget().getTriple().isOSBinFormatCOFF() &&
  •    !D.isLocalVarDecl() && C) {
     guard->setComdat(C);
     CGF.CurFn->setComdat(C);
    
    } else if (CGM.supportsCOMDAT() && guard->isWeakForLinker()) {

Thanks for noticing!

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 28, 2015

Will this patch by any chance make it into Clang 3.7?
My project currently doesn't build using clang on windows.
I'm not sure if this issue will arise when cross compiling from Linux, but I currently use GCC for that.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 29, 2015

Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.

@rnk
Copy link
Collaborator

rnk commented Aug 31, 2015

Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.

Sure. I'm concerned that this optimization doesn't work on ELF either, though, and that maybe we shouldn't be doing it in the first place.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Aug 31, 2015

Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.

Sure. I'm concerned that this optimization doesn't work on ELF either,
though, and that maybe we shouldn't be doing it in the first place.

Putting multiple global symbols in a comdat? That is fine on ELF. You just have to be sure that the same symbols are added in every TU.

@rnk
Copy link
Collaborator

rnk commented Aug 31, 2015

Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.

Sure. I'm concerned that this optimization doesn't work on ELF either,
though, and that maybe we shouldn't be doing it in the first place.

Putting multiple global symbols in a comdat? That is fine on ELF. You just
have to be sure that the same symbols are added in every TU.

GCC doesn't put the guard variable into the comdat of the global variable. Isn't that a problem?

@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 1, 2015

Do you think this patch is the right way to go?
If so, I'll post it on cfe-dev so others not following this bug may comment.

Sure. I'm concerned that this optimization doesn't work on ELF either,
though, and that maybe we shouldn't be doing it in the first place.

Putting multiple global symbols in a comdat? That is fine on ELF. You just
have to be sure that the same symbols are added in every TU.

GCC doesn't put the guard variable into the comdat of the global variable.
Isn't that a problem?

The itanium c++ abi says

Some objects with static storage duration have associated guard variables used to ensure that they are initialized only once (see 3.3.2). If the object is emitted using a COMDAT group, the guard variable must be too. It is suggested that it be emitted in the same COMDAT group as the associated data object, but it may be emitted in its own COMDAT group, identified by its name. In either case, it must be weak.

I remember discussing with someone (Richard maybe?) why the suggestion was not ABI breaking, but I can't remember right now :-(

@rnk
Copy link
Collaborator

rnk commented Sep 2, 2015

I've convinced myself it works for ELF because compilers have always defined both the global and the guard variable in the same TU. You only end up in the bad situation where you get two copies of a comdat member when one TU provides some of the external symbols of a comdat group and not the rest.

With that in mind, I think the right fix is to make this whole optimization conditional on isObjFormatELF() instead of !COFF and !MachO. :)

@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 3, 2015

r246803

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

3 participants