LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 11944 - Static constructors should be purged from LLVM
Summary: Static constructors should be purged from LLVM
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Core LLVM classes (show other bugs)
Version: 1.0
Hardware: PC All
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords: code-cleanup
Depends on:
Blocks:
 
Reported: 2012-02-07 19:42 PST by Chris Lattner
Modified: 2019-05-20 11:07 PDT (History)
12 users (show)

See Also:
Fixed By Commit(s):


Attachments
Current -Wglobal-constructors violations (583.85 KB, text/plain)
2012-02-08 13:48 PST, David Blaikie
Details
Current -Wglobal-constructors violations (667.04 KB, text/plain)
2012-02-09 13:28 PST, David Blaikie
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Lattner 2012-02-07 19:42:59 PST
See http://llvm.org/docs/CodingStandards.html#ci_static_ctors.  We should really purge all the remaining static constructors and destructors, then turn on the clang -Wglobal-constructors warning flag to prevent regressing in the future.
Comment 1 Benjamin Kramer 2012-02-08 07:28:44 PST
The biggest static ctor contributor in LLVM is the cl::opt command line interface. If we want to remove static ctors from LLVM we will need a replacement for that library.
Comment 2 David Blaikie 2012-02-08 13:10:36 PST
> The biggest static ctor contributor in LLVM is the cl::opt command line
> interface. If we want to remove static ctors from LLVM we will need a
> replacement for that library.

Any idea how we do that? Given that cl::opt is a registration system so that parsing command line arguments can populate/use those registrations - the trick (trivial type for the global, converting ctor for the 'real' type) done with the global Attributes doesn't apply here - because without a real ctor the registration would not occur.

The only other trick I know of is pinning data into particular linker sections that could then be walked at runtime - it'd remove the static ctor but I've only ever done something like that on Windows & not sure if/how we'd do it in a portable way for LLVM.

Other ideas/thoughts?
Comment 3 Chris Lattner 2012-02-08 13:42:38 PST
There are a couple of ways to handle cl::opt, but they're pretty ugly and will require some widespread API changes.  That's probably ok though, because cl::opt really needs a redesign for performance and functionality anyway.

Anyway, I'd suggest starting with everything that is *not* cl::opt.  Last I looked, tablegen was emitting some very large arrays that had static constructors into each target.  There are probably a bunch of other similar accidental ones.
Comment 4 David Blaikie 2012-02-08 13:48:02 PST
Created attachment 8021 [details]
Current -Wglobal-constructors violations

> Anyway, I'd suggest starting with everything that is *not* cl::opt.

Hmm - was hoping to ensure we had a plan/something that works for the hard case before worrying about the mechanical/relatively trivial stuff.

> Last I
> looked, tablegen was emitting some very large arrays that had static
> constructors into each target.  There are probably a bunch of other similar
> accidental ones.

Yeah, there's a lot in the tablegen'd target stuff. I've attached a file with all the current -Wglobal-constructors violations in LLVM+Clang as of r150077. I haven't categorized the different kinds of violations (separating opt from other things, etc) but "grep warning: errors.log | sort | uniq" gives a general idea of where the violations are.
Comment 5 Benjamin Kramer 2012-02-08 14:00:10 PST
Most of the tblgen'd madness went away during the MCTargetDesc refactoring. The remaining bit are the TargetRegisterClasses and the MVT arrays that accompany them.

The meat of it is already migrated to MC, but TargetRegisterClass itself is tricky because it contains virtual methods. Jakob wanted to fix this eventually, but I don't know if there's any short term plan.
Comment 6 Chris Lattner 2012-02-08 16:31:54 PST
The way to handle the virtual method issue is to have tblgen generate a POD struct, and then have the "virtual classes that the code generator actually forwards to" just contain the instance that tblgen poops out.
Comment 7 Chris Lattner 2012-02-08 16:37:45 PST
There are two pieces to the cl::opt refactor that I can think of.  The first (easier) half of the issue it that cl::opt<string> (and cl::list) all contain an instance of a non-pod type.  A straight-forward way to fix the string case is to rework all the cl::opt stuff so that ParseCommandLineOptions makes a copy of "argv" in a BumpPointerAllocator (which is then released at llvm_shutdown time) and each cl::opt contains a StringRef that points back to these strings.  Each time the "get" method is called on cl::opt<int>, it would do an "atop" to reparse the StringRef.  cl::list would be handled in a similar way, where it would actually contain an ArrayRef<StringRef> and the array data for the list of entries is created as ParseCommandLineOptions time, and owned by the same BumpPointerAllocator.
Comment 8 Chris Lattner 2012-02-08 16:41:35 PST
The second half of the cl::opt problem (and the more significant one) is the "static constructor to register it" problem.  This dovetails (tangentially) with the problem that cl::opts are in a global namespace and that they can collide.  It would be "really great" if cl::opts in the X86 target where in some x86-specific namespace and could not collide with ppc target options, for example.  Similarly for mid-level optimizer passes: the loop unroll pass should just have a "threshold" cl::opt, and it would be accessible as "-loop-unroll-threshold" and the inliner would have a "threshold" cl::opt which would be accessible as "-inline-threshold".

The best way that I can think of to handle this and the registration problem together is to change cl::opt into a macro that expands into the existing global variables, as well as an initialization function for each.  This initialization function would then be called by the existing pass initialization logic and would take the pass name to scope the option with.  This way, the cl::opts would be initialized iff the passes and targets are initialized.
Comment 9 Chris Lattner 2012-02-08 16:44:49 PST
Looking at David's list (thanks!) I'd prioritize fixing these classes of issues (since they are auto generated causing us to have a LOT of them):

build/lib/Target/X86/X86GenRegisterInfo.inc:3113:20: warning: declaration requires a global constructor [-Wglobal-constructors]
  static const EVT GR32_NOAX_and_GR32_NOREXVTs[] = {

This should be an easy fix to use MVT::SimpleValueType.  These should never contain unusual types that would require EVT.


build/lib/Target/X86/X86GenRegisterInfo.inc:3275:12: warning: declaration requires a global constructor [-Wglobal-constructors]
  GR8Class      GR8RegClass;
                ^~~~~~~~~~~

This can be fixed by splitting the auto generated portion out into a separate POD type generated by tblgen.


It's worth mentioning that I don't actually care about gtest static ctors, or static ctors in llvm/tools, since they don't affect clients linking to llvm libraries.
Comment 10 David Blaikie 2012-02-08 17:04:36 PST
Thanks for all your thoughts/notes (& any more you provide), Chris - I'll certainly take a look at the cases/approaches you've suggested (they haven't quite distilled in my head yet, but I'm getting there)

> It's worth mentioning that I don't actually care about gtest static ctors, or
> static ctors in llvm/tools, since they don't affect clients linking to llvm
> libraries.

Right - we'll figure out how to turn on flags for only the right projects when we reach that point. I'm not such a purist as to think it best to remove them rather than bend the build system to our whim.
Comment 11 Benjamin Kramer 2012-02-09 06:55:55 PST
(In reply to comment #9)
> This should be an easy fix to use MVT::SimpleValueType.  These should never
> contain unusual types that would require EVT.

Done in r150173. The VT tables have hardly any users so this was surprisingly simple.

> build/lib/Target/X86/X86GenRegisterInfo.inc:3275:12: warning: declaration
> requires a global constructor [-Wglobal-constructors]
>   GR8Class      GR8RegClass;
>                 ^~~~~~~~~~~
> 
> This can be fixed by splitting the auto generated portion out into a separate
> POD type generated by tblgen.

Most of the stuff is pImpl'd into MCRegisterClass which is POD. TargetRegisterClass itself contains a virtual method (getRawAllocationOrder) which gets overriden by TableGen'd code. I don't see any way to do this via value initialization. The virtual method has to be removed before this can be fixed.

Another thing I noticed is that TargetRegisterClass also has a virtual destructor, presumably to silence a warning that GCC emits. The destructor should be made protected & non-virtual (the warning was fixed to accommodate this in GCC 4.3 or 4.4) so we don't have static dtors.
Comment 12 Benjamin Kramer 2012-02-09 08:29:59 PST
WRT virtual dtors: LLVM is smart enough to devirtualize them here and with r150174 globalopt properly removes the calls to it, so it's not a big deal, just one extra pointer in the vtable
Comment 13 Chris Lattner 2012-02-09 12:35:31 PST
Jakob, do you have any suggestions on how we can eliminate the regclass static ctors?
Comment 14 David Blaikie 2012-02-09 13:28:07 PST
Created attachment 8028 [details]
Current -Wglobal-constructors violations

Here's the updated violations 2196 down from 2383 (-8%).

I'm still not sure I understand the TableGen code enough to track down/fix issues there, but I'll keep poking at whatever I can get my teeth into.
Comment 15 Jakob Stoklund Olesen 2012-02-09 13:35:08 PST
First of all, please make sure you are not chasing windmills here. With a hot buffer cache, a noop llc invocation runs in 0.5 ms, including static constructors and all those pesky data relocations. I do see the problem when linking LLVM statically into a larger application and more code needs to be paged in, though.

Some of these data structures are very hot, so be careful to not regress LLVM performance by adding extra indirections. 

That said, I would be perfectly happy if TargetRegisterClass didn't have a type list at all. Types have no business in codegen after isel. We only have a couple of places that use the list, and performance isn't critical. A slightly more expensive interface based on a SimpleTypeValue list would be fine.

We only subclass TargetRegisterClass to provide specializations of getRawAllocationOrder(), and many classes simply use the default implementation. I don't plan to add new virtual functions, so we could replace the vtable pointer with a function pointer:

class TargetRegisterClass {
  ArrayRef<unsigned> (*OrderFunc)(const MachineFunction&);
  ...

  ArrayRef<unsigned> getRawAllocationOrder(const MachineFunction &MF) const {
    return OrderFunc ? OrderFunc(MF) : makeArrayRef(begin(), getNumRegs());
  }
}

Then subclassing is no longer necessary, and TargetRegisterClass can be non-polymorphic and list-initialized.
Comment 16 Benjamin Kramer 2012-03-01 07:39:18 PST
TargetRegisterClasses are now value-initialized (r151806).
Comment 17 jonathan.sauer 2014-03-04 02:31:49 PST
I wonder how much of these static constructors could be removed by the switch to C++11 and constexpr: Change <const> to <constexpr> and make the constructor <constexpr> as well, and initialization happens at compile-time. Most likely this would fix tablegen issues such as those mentioned in comment 9.