String matching in Lempel-Ziv compressed strings
STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
Let sleeping files lie: pattern matching in Z-compressed files
SODA '94 Proceedings of the fifth annual ACM-SIAM symposium on Discrete algorithms
Approximating the smallest grammar: Kolmogorov complexity in natural models
STOC '02 Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
LATIN '00 Proceedings of the 4th Latin American Symposium on Theoretical Informatics
Pattern Matching in Compressed Texts
Proceedings of the 15th Conference on Foundations of Software Technology and Theoretical Computer Science
Perfect Hashing for Strings: Formalization and Algorithms
CPM '96 Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching
Collage system: a unifying framework for compressed pattern matching
Theoretical Computer Science - Selected papers in honour of Setsuo Arikawa
Application of Lempel-Ziv factorization to the approximation of grammar-based compression
Theoretical Computer Science
Efficient randomized pattern-matching algorithms
IBM Journal of Research and Development - Mathematics and computing
Linear work suffix array construction
Journal of the ACM (JACM)
Lower bounds for algebraic computation trees with integer inputs
SFCS '89 Proceedings of the 30th Annual Symposium on Foundations of Computer Science
ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming
Optimal pattern matching in LZW compressed strings
Proceedings of the twenty-second annual ACM-SIAM symposium on Discrete Algorithms
Faster fully compressed pattern matching by recompression
ICALP'12 Proceedings of the 39th international colloquium conference on Automata, Languages, and Programming - Volume Part I
Speeding up q-gram mining on grammar-based compressed texts
CPM'12 Proceedings of the 23rd Annual conference on Combinatorial Pattern Matching
Countless variants of Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern p[1 .. m] and a Lempel-Ziv representation of a string t[1 .. N], does p occur in t? Farach and Thorup [5] gave a randomized O(n log^2(N/n) + m) time solution for this problem, where n is the size of the compressed representation of t. Building on the methods of [3] and [6], we improve their result by developing a faster and fully deterministic O(n log(N/n) + m) time algorithm with the same space complexity. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A small fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan [4].
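To make the problem statement concrete, the following Python sketch shows only the naive baseline that compressed pattern matching aims to beat: decompress t in full, then search for p. It is not the algorithm of this paper. The factor format (single-character literals and (start, length) copies), the function names lz_decode and occurs_naive, and the sample input are all assumptions chosen for illustration, since the abstract does not fix a particular Lempel-Ziv variant.

```python
from math import log2


def lz_decode(factors):
    """Expand a list of factors into the full text.

    Assumed, illustrative format: each factor is either a one-character
    string (a literal) or a (start, length) tuple that copies `length`
    characters starting at 0-based position `start` of the text decoded
    so far.
    """
    out = []
    for factor in factors:
        if isinstance(factor, str):
            out.append(factor)
        else:
            start, length = factor
            # Copy one character at a time so that a copy may overlap
            # the region it is currently writing.
            for i in range(length):
                out.append(out[start + i])
    return "".join(out)


def occurs_naive(pattern, factors):
    """Baseline: decompress everything, then search.

    The running time is at least proportional to N, the length of the
    uncompressed text, no matter how small the compressed input is.
    """
    return pattern in lz_decode(factors)


# A highly compressible text: every copy factor doubles the decoded length,
# so N grows exponentially in the number of factors n.
factors = ["a", "b"] + [(0, 2 ** k) for k in range(1, 11)]
text = lz_decode(factors)
n, N = len(factors), len(text)

print("n =", n, "N =", N, "log2(N/n) =", log2(N / n))
print(occurs_naive("abab", factors), occurs_naive("aa", factors))
```

On inputs of this kind N is exponential in n, so any method that touches the whole decompressed text is far slower than one whose running time depends only on n, log(N/n) and m, which is the regime in which the bounds quoted above matter.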