LLVM
13.0.0git
|
A data structure for fast substring queries. More...
#include "llvm/Support/SuffixTree.h"
Classes | |
struct | RepeatedSubstring |
A repeated substring in the tree. More... | |
struct | RepeatedSubstringIterator |
Iterator for finding all repeated substrings in the suffix tree. More... | |
Public Types | |
typedef RepeatedSubstringIterator | iterator |
Public Member Functions | |
SuffixTree (const std::vector< unsigned > &Str) | |
Construct a suffix tree from a sequence of unsigned integers. More... | |
iterator | begin () |
iterator | end () |
Public Attributes | |
llvm::ArrayRef< unsigned > | Str |
Each element is an integer representing an instruction in the module. More... | |
A data structure for fast substring queries.
Suffix trees represent the suffixes of their input strings in their leaves. A suffix tree is a type of compressed trie structure where each node represents an entire substring rather than a single character. Each leaf of the tree is a suffix.
A suffix tree can be seen as a type of state machine where each state is a substring of the full string. The tree is structured so that, for a string of length N, there are exactly N leaves in the tree. This structure allows us to quickly find repeated substrings of the input string.
In this implementation, a "string" is a vector of unsigned integers. These integers may result from hashing some data type. A suffix tree can contain 1 or many strings, which can then be queried as one large string.
The suffix tree is implemented using Ukkonen's algorithm for linear-time suffix tree construction. Ukkonen's algorithm is explained in more detail in the paper by Esko Ukkonen "On-line construction of suffix trees. The paper is available at
https://www.cs.helsinki.fi/u/ukkonen/SuffixT1withFigs.pdf
Definition at line 137 of file SuffixTree.h.
Definition at line 343 of file SuffixTree.h.
SuffixTree::SuffixTree | ( | const std::vector< unsigned > & | Str | ) |
Construct a suffix tree from a sequence of unsigned integers.
Str | The string to construct the suffix tree for. |
Definition at line 19 of file SuffixTree.cpp.
References llvm::EmptyIdx.
|
inline |
Definition at line 344 of file SuffixTree.h.
|
inline |
Definition at line 345 of file SuffixTree.h.
llvm::ArrayRef<unsigned> llvm::SuffixTree::Str |
Each element is an integer representing an instruction in the module.
Definition at line 140 of file SuffixTree.h.