-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SymbolTable] Reconsider one symbol table for each type #783
Comments
I won't miss type-sensitivity. Go for it. |
All sounds good to me. I argued for removal of the type plane when I did the |
One thing to note: bytecode changes. If this symbol table change is made, it Reid. |
I think this change can be made without breaking either .ll or .bc formats. We -Chris |
Here is some more data. When running gccld on 252.eon with -disable-opt, I see % cumulative self self total That is spending about 60% of the time in the symbol table and map operations. Ugh. -Chris |
When you say "get rid of type sensitivity", I'm assuming that you mean the I still think this needs to get done by 1.3 because of bytecode changes. Its not |
Yes, exactly. I will post some more compelling numbers that show why this is
Yes, also true. The basic goal would be to make Instruction::setName() do as
I don't think this is the case... empty blocks that are opened in the bcwriter
Why would this change?
That is already implemented. If you -strip a bytecode file, it does not write -Chris |
Here is some more exciting and compelling evidence for this bug. :) After % cumulative self self total The big consumer of std::_Rb_tree_base_iterator::_M_increment() and find seem to 0.58 4.05 0.03 7575 0.00 0.00 FindGlobalNamed gccld takes 8.5s total on eon without optimization, so this function corresponds -Chris |
Okay, I hacked around the FindGlobalNamed problem, dramatically speeding up the However, this PR still should be fixed. :) -Chris |
I mis-spoke a bit. Its not the length field that will be wasted, its the slot
So, what I'm advocating is that we eliminate 5 and 7 from the bytecode format.
Functions don't contain types. Types are all written at the global level. They |
I really don't think that this is a big deal (there are much better things for Note that this is a big risk right before the release, so unless you feel -Chris |
Going to give this one another crack. |
Cool! -Chris |
These patches implement the TypeSymbolTable and ValueSymbolTable classes, a http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20060109/030628.html |
Sabre, There's a problem with this implementation. When reading a bytecode file, I I think the correct thing to do is leave the type planes in the bytecode file Any thoughts on this? |
Sounds fine to me. This will make the writer a bit slower, but the reader fast. Sounds like a good Thanks for digging into this Reid! -Chris |
Chris, Type planes have been retained in the bytecode file (but not the symtab) and I have gotten rid of many bugs in the asmparser but am now faced with a failing FAIL: /proj/llvm/build/../llvm/test/Feature/alignment.ll:
|
Okay, the re-naming issue turned out to be a red herring. I wasn't checking for |
yeah, these sorts of tests first asm/disasm the llvm file to get a canonical output (this part should do the -Chris |
I'm now looking at the tests that use the same intrinsic with different argument
I was thinking we could just make the assembler "smart" about intrinsics and Got any other ideas? |
Another idea: fixing this bug will take a concerted effort and is a nasty one to fix. OTOH, the real typedef std::map<const std::string, Value *> ValueMap; Things would be much more efficient (both in memory-use and compile-time) if we used this instead: std::map<std::pair<const Type *, std::string>, ValueMap> ... and this should be a much easier change to make. -Chris |
I don't see how that helps. Instead of looking up the plane and then searching a How about we switch this to the google dense hash map? Reid. |
I already clarified with reid, but for the record: This is a size win, because it eliminates one std::map per type plane, a significant memory win. Using a better map is another potentially useful change, but is orthogonal to this one. This is required no -Chris |
An incremental step has been taken with patches committed today by separating Next Steps:
|
#1: excellent point -Chris |
Chris, With type planes gone the symbol table no longer will allow a global variable Reid. |
For the record, Chris answered that this would be a good thing on IRC. |
Chris, I have llvm/llvm-bugzilla-archive#411 working except for a few FunctionResolve test cases. If I Please confirm. Reid. |
yeah, ignore funcresolve, it can be removed as part of this bug. just make sure the linker does the right thing when you link a.c: b.c: The two foo's should get linked up and end up having a prototype that matches the definition. -Chris |
Right, will do. |
Chris, This linking is a bit tricky. I had assumed that LinkModules.cpp already handled
Reid. |
C doesn't require prototypes to match, but doesn't specify what should happen if they don't. The way we should implement this in LLVM is to pick one function as the designated result (either the In the example above, I'd expect the result to be something like this: bar() { (void(*)(...))(foo) (123); } void foo(int X) {} |
w00t! This is finally done with a series of commits today. |
…m#783) This PR set a proper addrspace attribute for LLVM globals. For simplicity, a temporary pointer type is created and consumed by our LLVM type converter. The correct address space is then extracted from the converted pointer type of LLVM.
…m#783) This PR set a proper addrspace attribute for LLVM globals. For simplicity, a temporary pointer type is created and consumed by our LLVM type converter. The correct address space is then extracted from the converted pointer type of LLVM.
Extended Description
LLVM currently keeps one symbol table for each type plane in the program. There
are historical reasons for this, but they aren't particularly relevant any more.
I think the symbol table should just be a simple wrapper around two maps from
std::string to Value* and Type* (a one level map, not a two level one like we have).
Here are some of the problems that the current symbol table design causes:
This is really slow in the common case. In particular, to get to an element
in the symbol table, we need to go through two levels of maps (one from type
-> ValueMap, one from ValueMap -> Value*). In particular, when gccld'ing
252.eon, the top two killers are map operations which I believe are due
directly to the symbol table.
Important operations are really slow. In particular Module::getNamedFunction
takes linear time with size of the module because we don't know what type the
named function will have. This leads to the nastiness we see in the
Module::getMainFunction() implementation.
Linking is a nightmare, and because of this nightmare, we have to have things
like the function resolution pass, which is a certified disturbing hack.
I'm beginning to think that the function scoped symbol table and the module
scoped symbol table should be different classes entirely (once simplified).
In particular, the function-scope symbol table cannot have types in it.
Another capability that we want is to be able to completely disable the
function-level symbol table in certain cases, such as release mode. Names
inside of a function have no meaning, they are just there to make debugging
easier. Why should release mode have to pay for all of those map lookups?
Something that is important to point out is that LLVM is not serving the
front-end at all by having type-specific symbol table planes at the global
scope. In particular, even a language that wants to allow overloading based on
type, for example, will have to implement name mangling itself: the LLVM types
for a particular method are not necessarily going to be the distinct in the same
cases the source-level types would be (e.g. due to structural equivalence).
So anyway, here is the proposal:
names inserted into it (setName on a Instruction, Argument, or BasicBlock
would be a noop).
Thoughts?
-Chris
The text was updated successfully, but these errors were encountered: