Created attachment 11080
double initialization test case

The attached test case fails when built with and without -DCONFIG_1 while mixing compilers.

The Microsoft ABI does *not* use guard variables to avoid double initialization of linkonce_odr data, but my fix for PR16888 does. What's supposed to happen is something like the following:

---
.CRT$XCU is a section similar to .init_array on Linux / ELF: an array of void function pointers. Each TU with a weak initializer emits two symbols:

  void ("?__E" <data>)(void)
  void (* <data> "$initializer$")(void)

The __E symbol is the initializer code stub, which goes into the usual COMDAT selectany text section, the way we would emit any inline function.

The $initializer$ symbol goes into a .CRT$XCU COMDAT section with the IMAGE_COMDAT_SELECT_ASSOCIATIVE flag, where it is associated with the __E symbol. This way, the linker picks one __E symbol and then throws away the function pointers associated with the copies it discarded, thereby leaving only one call to the initializer.
---

Currently clang orders initializers within a TU by creating one giant function (_GLOBAL__I_a etc.) that calls each initializer stub in turn. This means that we can't perform COMDAT elimination on the function pointers in .CRT$XCU.

The LLVM LangRef says that initializers in llvm.global_ctors with the same priority are not ordered, but I wonder if we could loosen that so that initializers with the same priority in a single TU are called in the order defined by the global_ctors array. With that change, I don't think clang would need to emit _GLOBAL__I_* functions. Are there any known object file formats or platforms where the linker would violate that guarantee?
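For readers without the attachment, here is a minimal sketch of the kind of source that produces a weak (linkonce_odr) initializer. It is not the attached test case. The names Holder, countedInit, and g_initRuns are hypothetical, and a single TU cannot reproduce the cross-compiler failure; it only shows the invariant that must hold after linking: the initializer runs exactly once.

```cpp
#include <cassert>

// Constant-initialized, so it is already zero before any dynamic
// initializer runs.
int g_initRuns = 0;

// Hypothetical initializer with a visible side effect.
int countedInit() {
    ++g_initRuns;
    return 42;
}

// Static data member of a class template: every TU that implicitly
// instantiates Holder<int>::value emits its own weak copy of the data
// and of the initializer.
template <typename T>
struct Holder {
    static int value;
};

template <typename T>
int Holder<T>::value = countedInit();

// Using the member forces the implicit instantiation in this TU.
int readHolder() { return Holder<int>::value; }

// The invariant the ABI has to preserve: one run, however many TUs
// contained a copy. Intended to be called from main, after static
// initialization has finished.
void checkInitializedOnce() {
    assert(readHolder() == 42);
    assert(g_initRuns == 1);
}
```

In the failing configuration, a check like this would presumably see the counter incremented twice, once per surviving .CRT$XCU entry.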
We can implement priorities by appending the priority to the section name, similar to how ELF does it. The COFF scheme is supposed to look like this:

  .CRT$XCA // marker for array start
  .CRT$XCU // default ctors
  .CRT$XCZ // marker for array end

It's explained at http://msdn.microsoft.com/en-us/library/bb918180.aspx. The linker sorts sections by the part of the name after the $ and concatenates them. We should be able to add sections like:

  .CRT$XCT000001
  .CRT$XCT065535

These will sort prior to the default ctor priority, which will go into .CRT$XCU.
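A small sketch of the naming scheme, to check that the proposed names sort where expected. The helper names (prioritizedCtorSection, linkOrder, kDefaultCtorSection) are hypothetical, the six-digit zero padding is taken from the examples above, and a plain string sort stands in for the linker's ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Section used for ctors with the default priority.
const char* const kDefaultCtorSection = ".CRT$XCU";

// Hypothetical helper: builds the section name for an explicit priority,
// zero-padded to six digits so that string order matches numeric order.
std::string prioritizedCtorSection(unsigned priority) {
    char buf[32];
    std::snprintf(buf, sizeof buf, ".CRT$XCT%06u", priority);
    return buf;
}

// Stand-in for the linker: all names share the ".CRT$" prefix, so sorting
// the full names is equivalent to sorting by the text after the '$'.
std::vector<std::string> linkOrder(std::vector<std::string> sections) {
    std::sort(sections.begin(), sections.end());
    return sections;
}

void demoSectionOrder() {
    std::vector<std::string> order = linkOrder({
        ".CRT$XCZ", kDefaultCtorSection, prioritizedCtorSection(65535),
        ".CRT$XCA", prioritizedCtorSection(1)});
    assert(order[0] == ".CRT$XCA");        // array start marker
    assert(order[1] == ".CRT$XCT000001");  // explicit priorities, ascending
    assert(order[2] == ".CRT$XCT065535");
    assert(order[3] == ".CRT$XCU");        // default ctors
    assert(order[4] == ".CRT$XCZ");        // array end marker
}
```

The fixed-width padding matters: without it, a name ending in 10 would sort before one ending in 9.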
After discussing this with Richard, it seems we don't want to guarantee the order of execution of global_ctors, because GlobalOpt will attempt to symbolically execute each initializer in turn, out of order. By batching up all the initializers in a TU into a _GLOBAL__I_ stub, clang ensures that they either all execute in order, or none at all.

Instead, I'm looking into this:

- Split out initializers for static data members of class template specializations into their own global_ctors entries. The language does not guarantee the order of initialization of these. This will allow GlobalOpt to fire more often for Itanium.
- Steal a priority bit to indicate whether an initializer is "unique", meaning it should only run once, and then fix LTO global_ctors merging to dedupe unique initializers.
- LLVM CodeGen sees the unique bit and emits an IMAGE_COMDAT_SELECT_ASSOCIATIVE .CRT$XCU section associated with the initializer.
- GlobalOpt may have to skip symbolic execution of unique initializers. Check what cl.exe does.
- GlobalOpt shouldn't be allowed to modify data unless it knows it has the only definition. Check that this is true.
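The "unique" bit and the LTO dedupe step could look roughly like the sketch below. Everything here is an assumption for illustration: the choice of bit 16 (on the premise that priorities fit in 16 bits), the CtorEntry layout, and identifying initializers by name. It is not LLVM's data structure or API.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Assumption: real priorities fit in the low 16 bits, leaving bit 16 free
// to mark an initializer that must run only once.
constexpr std::uint32_t kUniqueBit = 1u << 16;

// Hypothetical model of one global_ctors entry.
struct CtorEntry {
    std::uint32_t priority;  // low 16 bits: priority; bit 16: unique flag
    std::string function;    // name of the initializer
};

bool isUnique(const CtorEntry& e) { return (e.priority & kUniqueBit) != 0; }

// Merges the ctor lists of two modules. Ordinary entries are all kept;
// a unique entry is kept only the first time its function is seen.
std::vector<CtorEntry> mergeCtors(const std::vector<CtorEntry>& a,
                                  const std::vector<CtorEntry>& b) {
    std::vector<CtorEntry> merged;
    std::set<std::string> seenUnique;
    auto append = [&](const std::vector<CtorEntry>& list) {
        for (const CtorEntry& e : list) {
            if (isUnique(e) && !seenUnique.insert(e.function).second)
                continue;  // duplicate of an already-merged unique initializer
            merged.push_back(e);
        }
    };
    append(a);
    append(b);
    return merged;
}

void demoMerge() {
    std::vector<CtorEntry> tu1 = {{65535u, "tu1_ordered_inits"},
                                  {65535u | kUniqueBit, "init_S_int_x"}};
    std::vector<CtorEntry> tu2 = {{65535u, "tu2_ordered_inits"},
                                  {65535u | kUniqueBit, "init_S_int_x"}};
    std::vector<CtorEntry> merged = mergeCtors(tu1, tu2);
    assert(merged.size() == 3);  // the second init_S_int_x was dropped
}
```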
It turns out that the function pointer going into .CRT$XCU is actually associated with the *data*, not the initializer (__E). We don't put the data in global_ctors, so I'll have to widen the struct to be 2 or 3 elements where the third is optional. The third is a pointer to an associated GlobalValue.
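A toy model of the widened entry, assuming the optional third element names the associated data and that the linker keeps the first copy of each COMDAT datum it sees. CtorRecord, linkCtors, and the symbol names are hypothetical; the real struct would hold a pointer to a GlobalValue rather than a string.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical widened ctor record: the third element is optional.
struct CtorRecord {
    int priority;
    std::string initializer;     // name of the initializer stub
    std::string associatedData;  // empty means no associated data
};

// Models COMDAT selection keyed on the data: a record with associated data
// survives only in the first object file whose copy of that data is chosen.
// Records without associated data always survive.
std::vector<CtorRecord> linkCtors(
        const std::vector<std::vector<CtorRecord>>& objects) {
    std::set<std::string> selectedData;
    std::vector<CtorRecord> kept;
    for (const std::vector<CtorRecord>& object : objects) {
        for (const CtorRecord& r : object) {
            bool keep = r.associatedData.empty() ||
                        selectedData.insert(r.associatedData).second;
            if (keep) kept.push_back(r);
        }
    }
    return kept;
}

void demoAssociatedData() {
    std::vector<CtorRecord> tu1;
    tu1.push_back(CtorRecord{65535, "tu1_ordered_inits", ""});
    tu1.push_back(CtorRecord{65535, "tu1_init_x", "S<int>::x"});
    std::vector<CtorRecord> tu2;
    tu2.push_back(CtorRecord{65535, "tu2_init_x", "S<int>::x"});

    std::vector<CtorRecord> kept = linkCtors({tu1, tu2});
    assert(kept.size() == 2);  // tu2's pointer is dropped with its data copy
    assert(kept[1].initializer == "tu1_init_x");
}
```

Because selection is keyed on the data symbol, the model drops the duplicate even when the two TUs name their initializer stubs differently.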
I hit this recently, so I'm going to dust off my old patch for it.
(fixed in r209555?)
Yep! r209555. :)