If we want to search sequentially for occurrences of many patterns in a given text, then we can apply any of dozens of multi-pattern matching algorithms in the literature. As far as we know, however, no one has said what to do if we are given a compressed self-index for the text instead of the text itself. In this paper we show how to take advantage of similarities between the patterns to speed up searches in an index. For example, we show how to store a string $S[1..n]$ in $nH_k(S) + o(n(H_k(S) + 1))$ bits such that, given the LZ77 parse of the concatenation of $t$ patterns of total length $\ell$ and maximum individual length $m$, we can count the occurrences of each pattern in a total of $O((z + t) \log \ell \log m \log^{1+\epsilon} n)$ time, where $z$ is the number of phrases in the parse.
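To make the input quantities concrete, the following is a minimal sketch of how one might compute an LZ77 parse of the concatenated patterns and read off t, the total length, m, and z. It is not the paper's algorithm: it only illustrates why similar patterns yield a small z. The function names `lz77_parse` and `parse_patterns` are hypothetical, the parse variant (each phrase is either a single fresh character or the longest earlier-starting match, with self-overlap allowed) is an assumption, and the naive search is far slower than a practical parser.

```python
def lz77_parse(text):
    """Greedy LZ77 factorization (naive sketch, assumed variant).

    Each phrase is either a single character with no earlier match, or the
    longest prefix of the remaining text that also starts at an earlier
    position (the copy may overlap the phrase itself).
    """
    phrases = []
    i = 0
    n = len(text)
    while i < n:
        best_len = 0
        # Try every earlier starting position j and extend the match.
        for j in range(i):
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            best_len = max(best_len, k)
        step = max(best_len, 1)  # a fresh character forms its own phrase
        phrases.append(text[i:i + step])
        i += step
    return phrases


def parse_patterns(patterns):
    """Concatenate the patterns and report the parameters in the time bound."""
    concatenated = "".join(patterns)
    phrases = lz77_parse(concatenated)
    return {
        "t": len(patterns),                  # number of patterns
        "ell": len(concatenated),            # total pattern length
        "m": max(len(p) for p in patterns),  # maximum individual length
        "z": len(phrases),                   # number of LZ77 phrases
        "phrases": phrases,
    }


if __name__ == "__main__":
    print(parse_patterns(["abab", "abab", "abab"]))
    print(parse_patterns(["abcd", "efgh", "ijkl"]))
```

In this sketch, three copies of `abab` give z = 3 for a total length of 12, while three patterns with no shared characters give z = 12, which shows how similarity between patterns shrinks the (z + t) factor in the bound.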