The current way we unique (complex) constants and types is incredibly inefficient. These should use FoldingSet instead, which will significantly speed up things like bcreading and general IR operations. Once this is done, we can migrate to an API that doesn't require a temporary vector for creating things like a StructType or a ConstantArray. -Chris
For Types, I assume you are talking about replacing the std::map in the TypeMap class in Type.cpp (std::map<ValType, PATypeHolder>). If so, there is a related change in bug 388 which would move the ValType in that std::map into the Type class itself instead of having one copy in the std::map and another in the Type (or subclass) object. If I understand the intent here and FoldingSet correctly, this isn't actually necessary with FoldingSet because the individual fields that make up the "value" can be set in the Profile method of the FoldingSetNode subclass. So, I would suggest we make the contents of each Type subclass a FoldingSetNode that contains the needed data of the Type subclass. This leads to a situation where a given Type subclass object has multiple FoldingSetNode instances (one for each class in the hierarchy from Type to the derived Type). Is there a way to make FoldingSet work in such a situation? Or, will we have to create a FoldingSetNode class hierarchy that mimics the Type class hierarchy and only include the appropriate FoldingSet node in the leaf class? Seems like a lot of work, so maybe the baby step is just to implement the FoldingSetNode classes separately (and not get the memory/malloc savings). Am I on the right track here?
You're on the right track. The only detail is that there should only be one FoldingSetNode baseclass for each type: it can only go into the folding set for the derived class (e.g. ArrayType has a folding set, but not SequentialType). -Chris
Okay, one last thing I'm unclear about. The example in FoldingSet.h has the FoldingSetNode containing the data members. Is that required or could the nested FoldingSetNode subclass access the DerivedType's members in the Profile method somehow? I'm envisioning something like: class ArrayType : public SequentialType { uint64_t NumElements; class ArraayTypeFoldingSetNode : public FoldingSetNode { void Profile(FoldingSetNodeID &ID) { ID.AddString(NumElements); ///< Not valid ... } } FSN; }; The above scenario could work if Profile was passed in an instance of its containing class. It would probably need to be templatized too. So, the above scenario can only work if FoldingSetNode can deal with calls to methods on ID with parameters that are not members of FoldingSetNode. If this isn't possible, then how do we deal with the subclass data? The Type class's data would have to be duplicated in each leaf class of the hierarchy. That sounds, yucky 'n stuff.
The actual types *would* need to derive from FoldingSetNode. Perhaps a simple explanation would help: Think of foldingset node as a map. Consider "pointertype" for example. In the case of uniquing pointertypes, you want ot map from pointee types (e.g. i32) to the pointee types if they have been created (i32*). The problem with actually having std::map<Type*, PointerType*> is that we are duplicating the information of the type: the pointee type is both in the key and in the value. In the case of a pointer type, this isn't a big deal, in the case of a struct type (where they key is an array of types) this is. The obvious refinement is to notice that uniquing really just needs a set. Consider std::set<PointerType>. Now you don't have to duplicate the key. OTOH, to do a lookup, you now need to *create* a pointertype, do the lookup, then throw the pointertype away if it already exists. The idea of foldingset is to abstract out the key. Basically it can be computed from the info you have when doing the lookup (e.g. a Type* or a vector<Type*> for pointertype or structtype) or it can be created from a foldingsetnode already in the set. Both of these are important when doing a lookup. Because it needs to compute the information about stuff in the set on the fly, it needs to be able to get at the ivars in the (e.g.) pointertype. Thus, pointertype has to derive from foldingsetnode at some level. -Chris
Okay, the obvious place to make the derivation is at Type class. I thought about making AbstractTypeUser derive from FoldingSetNode to eliminate the multiple inheritance but there are non-Type subclasses of AbstractTypeUser that would be affected. Plus, it doesn't make sense for ATU to derive from FSN. So, you okay with the multiple inheritance of ATU and FSN by Type ?
yep, multiple inheritance should be ok.
I've been toying around with making this work. It turns out that FoldingSet isn't well adapted to the type uniquing job. The problem is that we want to manipulate PATypeHolder objects via a DerivedType interface. PATypeHolder must be used in the set so that the type uniquing algorithm can work. However, we also need FoldingSet to be handling the subclasses of Type. One solution is to allow a second parameter to the FoldingSet which is the "smart pointer" class to use inthe buckets. By default this is Node* but we could set it up to use PATypeHolder for the current situation. This would allow the type uniquing to work within the FoldingSet.
I seem to have found a way around the FoldingSet issue. There are some bugs surrounding the destruction of StructType and FunctionType which have weird allocation strategies. When I get those fixed, it should be good to go.
I think I have the problem tracked down now. Its not memory allocation, but C++ with MI and virtual tables. I currently have Type inherited like this: class Type: public FoldingSetNode, public AbstractTypeUser { ... } Now, when we insert a Type* into the FoldingSet, the pointer gets changed by 4 to skip over the virtual pointer. Consequently on retrieval, the object looks ill formed. Yes, it points to a FoldingSetNode, but not to a Type*. Casting it as a Type* leads to all manner of ills. I really don't know what the correct solution is for this. I could change the interface of FoldingSet::InsertNode and FoldingSet::RemoveNode to take a void* but that seems fraught with issues for other clients. Another solution would be to (artificially) put a virtual table pointer into FoldingSetNode (yuck). Are there no other uses of FoldingSetNode with classes that have virtual tables?
Created attachment 1086 [details] Patch to implement this PR (not working) This patch implements most of the enhancement but fails because of the FoldingSet issue described in the previous comment. Until a resolution for that is found, this can't be committed. NOTE: THIS PATCH DOES NOT WORK!
Reid isn't an active participant right now.
Type uniquing is now done with densemaps, both constants and types should be uniqued with densemaps where it makes sense (simple keys) or FoldingSet otherwise (e.g. ConstantArray).
Types are now bump pointer allocated and generally efficient. The only major badness is that StructTypes and FunctionTypes uniquing key is a std::vector which is really inefficient. It would be better to use a FoldingSet-style hash or something. Constants are still really gross all around.
Created attachment 7981 [details] Patch for optimizing function and anonymous struct type uniquing.
I made it here by way of the suggestion on the Open Projects page that fixing code cleanup bugs is a good way to get your feet wet. I would like to pick up this issue. Attached is a patch that uses a FoldingSet for anonymous struct and function type uniquing. The patch passed all 'make check' regression tests. Avoiding all the 'std::vector' comparisons is clearly a win, but I wasn't sure of a way to quantify the improvement via a benchmark. Should I send the patch to llvm-commits? I am going after the constant cleanup next.
Hmmm, I'm also working on this at the same time, with a somewhat different approach. I made the Constants keys instead of values, and then use find_as() to search for a constant by value. This eliminates the need to store two copies of the data in the map.
> Hmmm, I'm also working on this at the same time, with a somewhat different > approach. Which part are you working on? I just posted a patch for cleaning up the struct and function type uniquing. I haven't posted my patch for the constant cleanup yet, but ... > I made the Constants keys instead of values, and then use find_as() > to search for a constant by value. This eliminates the need to store two copies > of the data in the map. ... I got the impression that FoldingSets can be used to solve the complex constant uniquing problem as well. Currently ConstantUniqueMaps are being used with a ValType of std::vector<Constant*>. So we have the std::vector comparison problem again. I think this can be solved by profiling the constants and complex constant type for lookup into a FoldingSet. A pointer to the ConstantClass instance will be stored in the FoldingSet. That is how I am working it at least. I will post a patch in the next day or so using that method.
I was working on the uniquing of ConstantArray, ConstantStruct, etc. I didn't use a folding set. I changed the map to a DenseMap, and got rid of MapKey. In my scheme, the map key is ConstantArray*, ConstantStruct*, etc. In other words rather than having a MapKey<Type,Array>, I just use the constant itself as the key. DenseMap.find_as() allows you to do a find() with a lookup key that has a different C++ type than the keys in the map, so long as the two types are comparable and hash to the same values. So I can lookup an array by its contents (type + ArrayRef of operands). This avoids the need to have two copies of the operand list (one for the map key and one inside the constant).
Hi guys, I think it's great that you're both working on this, except that there is now a conflict. Talin, can you take the discussion to llvmdev and lay out your thoughts and plan of attack?
Sounds good to me. I am willing to help in whatever way I can (patch review, testing, coding, ...). The patch I posted for cleaning up anonymous struct and function type uniquing is orthogonal to the constant uniquing cleanup work. Should I post that patch to llvm-dev?
I wrote up a post as requested. I also want to note that my implementation isn't done yet.
Also, I'm fine with Meador's implementation using folding set for function and struct type uniquing. So far I have only worked on ConstantStruct, ConstantArray, and ConstantVector.
Cool, please review it! :)
Created attachment 7984 [details] Patch for uniquing ConstantArray OK here's a patch implementing ConstantArray uniquing using a DenseMap. Only ConstantArray is affected by this, the other data types still use the older method. This passes "make check".
Created attachment 7991 [details] Patch for uniquing ConstantArray, ConstantStruct and ConstantVector Here's an update to the patch that uses the new map class for ConstantArray, ConstantStruct and ConstantVector. All the other constant types use the old method. I've verified that "make check" passes.
I looked at Meador's patch for uniquing of struct and function types. First, I realized that this does not overlap with what I am doing, because his patch is aimed at uniquing types, whereas mine is aimed at uniquing of constants. (Although I think he may have been working on that as well, but that's not in the patch.) My one concern about the patch is that it makes FoldingSetNode a public base class of FunctionType and StructType. It seems to me that LLVM tries really hard to avoid exposing any implementation details of types or constants. (By "really hard" I mean that (a) there's a lot of tricks in the code base that have the apparent purpose of making it impossible for external code to see anything except what is in the public API, and (b) it even tries to hide *private* methods from external viewers. That's why you have classes like LLVMContextImpl and s
Reopening. Talin, you closed this bug with no comment, but it looks like type uniquing hasn't been changed yet.
As for Meador's patch for uniquing of struct and function types, it replaces this code in FunctionType::get: - // TODO: This is brutally slow. - std::vector<Type*> Key; - Key.reserve(Params.size()+2); - Key.push_back(const_cast<Type*>(ReturnType)); for (unsigned i = 0, e = Params.size(); i != e; ++i) - Key.push_back(const_cast<Type*>(Params[i])); with: + FoldingSetNodeID ID; + ID.AddPointer(ReturnType); for (unsigned i = 0, e = Params.size(); i != e; ++i) + ID.AddPointer(Params[i]); As far as I can see the only significant difference is that instead of pushing all the data into a std::vector, we're pushing it into FoldingSetNodeID's SmallVector. This seems like a very complicated way of switching from std::vector to SmallVector. (This is another way of saying that I've never understood the point of FoldingSet!) I prefer the approach Talin took for constants, where you don't have to push all the data into any kind of vector at all.
Thanks for the review guys. I agree with Talin's FoldingSetNode public interface point and Jay's point concerning trading a std::vector for a SmallVector. In practice my patch yielded only about an 8% memory reduction in the best case and negligible reductions in time. I am going to explore applying Talin's method to function and anonymous struct type uniquing.
Created attachment 8074 [details] v2 patch for optimizing fucntion and anonymous struct type uniquing Attached is v2 patch which addresses the review feedback: 1. The implementation details are private to 'LLVMContextImpl.h'. 2. A 'find_as' type lookup method is used instead of the FoldingSetNodeID stuff from v1. I also took advantage of Talin's new Hashing work (nice work BTW!). Full 'make check' run done; no regressions. I also did some basic benchmarks. The best case for anonymous structures was a ~30% memory usage reduction. The best case for function types was a ~13% memory usage reduction. The runtime impact in both cases was negligible.
> Attached is v2 patch which addresses the review feedback: Looks good to me (not that I have any official powers of review)! Minor point: the definitions of AnonStructTypeKeyInfo and FunctionTypeKeyInfo are a bit long-winded. Do you think there's any scope for commoning them up a bit? Or could DenseMap.h provide a useful base class for XXXKeyInfo classes, or something?
They can be shortened a bit: ArrayRefs have an equality operator, and GeneralHash takes an ArrayRef, so there's no need to loop through the elements individually. Also, for GeneralHash you can chain the adds: Hash.add(ReturnType).add(Params).add(isVarArg); There are a few other places where the code could be made more concise. Other than that, I probably wouldn't try to combine them - I feel the combined result would be more complicated and abstract.
I'm not going to have time to review it, but if Jay is happy with it, then so am I. BTW, Jay, you *do* have official reviewer powers, as does anyone else with commit access.
I hereby officially review your patch, Meador. Please commit! (Or do you need someone to commit it for you? I can do that, but not for another 12 hours or so.)
Created attachment 8078 [details] v3 patch for optimizing fucntion and anonymous struct type uniquing Here is an updated patch that uses the '==' overloads that Talin pointed out. Thanks for the tip Talin. I left the 'add' calls alone. I find the non-chaining easier to read. Jay, I don't have commit rights. That will be great if you can commit for me.
I would just write: bool operator==(const KeyTy& that) const { return isPacked == that.isPacked && ETypes != that.ETypes; }
> Created attachment 8078 [details] > v3 patch for optimizing fucntion and anonymous struct type uniquing Committed in r151049 (with tabs changed to spaces). Thanks! I reckon this bug is now fixed.
nice!!
I'm a bit confused about the timing numbers that Meador reported. When he said "negligible performance impact", did he mean compared to his version, or compared to the base line? I'd be really surprised if there was no performance improvement from the base line code, it means we missed something and probably need to dig further. BTW, as I mentioned to Chris, the next bit of low-hanging fruit in this arena is the uniquing of MDNodes.
I am reopening this until the GCC 4.4 strict aliasing problem with the Hashing implementation gets worked out (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120220/137516.html). As for the benchmarks, they are relative to the baseline. I was surprised too. I think for many cases in practice we don't hit the worst case for key comparisons (i.e. even with large vector comparisons we find inequality relatively quickly and we don't end up scanning long chains in the hash buckets). The memory improvements were very obvious. I will drill down into the runtime aspects a little more to see if anything pops out.
Hashing has been fixed so this change has be re-applyed (http://llvm.org/viewvc/llvm-project?view=rev&revision=151248). Closing. Open Projects should be updated too: http://www.llvm.org/OpenProjects.html.
> Open Projects should be updated too: http://www.llvm.org/OpenProjects.html. Done in r151253.