Faster fully compressed pattern matching by recompression

Authors:
Artur Jeż
Affiliations:
Institute of Computer Science, University of Wrocław, Poland
Venue:
ICALP'12 Proceedings of the 39th international colloquium conference on Automata, Languages, and Programming - Volume Part I
Year:
2012


Abstract

In this paper, the fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e., context-free grammars that each generate exactly one string; the term "fully" means that both the pattern and the text are given in compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored so that substrings of the pattern and the text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way. This technique yields an $\mathcal{O}((n+m)\log M \log(n+m))$ algorithm for compressed pattern matching, where $n$ and $m$ are the sizes of the compressed representations of the text and the pattern, respectively, and $M$ is the size of the decompressed pattern. Since $M \le 2^m$, this substantially improves on the previously best $\mathcal{O}(m^2 n)$ algorithm. Since the LZ compression standard reduces to SLPs with $\log(N/n)$ overhead and in $\mathcal{O}(n \log(N/n))$ time, where $N$ is the size of the decompressed text, the presented algorithm can also be applied to the fully LZ-compressed pattern matching problem, yielding an $\mathcal{O}(s \log s \log M)$ running time, where $s = n\log(N/n) + m\log(M/m)$.
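To make the objects in the abstract concrete, here is a minimal Python sketch of an SLP and of why the decompressed size can be exponential in the grammar size. It is not the recompression algorithm from the paper. The dictionary-based representation, the restriction to rules of the form "one terminal" or "two nonterminals", and all function names (`slp_length`, `slp_expand`, `doubling_slp`, `naive_match`) are assumptions made for illustration only. The baseline `naive_match` decompresses both inputs, which is exactly the cost a fully compressed algorithm is designed to avoid.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical representation: each nonterminal maps either to a single
# terminal character or to a pair of nonterminals. Every nonterminal has
# exactly one rule, so the grammar generates exactly one string.
Rule = Union[str, Tuple[str, str]]
SLP = Dict[str, Rule]


def slp_length(slp: SLP, start: str) -> int:
    """Length of the generated string, computed without decompressing it."""
    memo: Dict[str, int] = {}

    def length(sym: str) -> int:
        if sym not in memo:
            rule = slp[sym]
            if isinstance(rule, str):
                memo[sym] = len(rule)
            else:
                left, right = rule
                memo[sym] = length(left) + length(right)
        return memo[sym]

    return length(start)


def slp_expand(slp: SLP, start: str) -> str:
    """Fully decompress the string; this may take time exponential in len(slp)."""
    rule = slp[start]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return slp_expand(slp, left) + slp_expand(slp, right)


def doubling_slp(k: int) -> Tuple[SLP, str]:
    """X0 -> 'a', Xi -> X(i-1) X(i-1): k + 1 rules generating 'a' repeated 2**k times."""
    slp: SLP = {"X0": "a"}
    for i in range(1, k + 1):
        slp[f"X{i}"] = (f"X{i-1}", f"X{i-1}")
    return slp, f"X{k}"


def naive_match(text: SLP, t_start: str, pat: SLP, p_start: str) -> List[int]:
    """Baseline only: decompress both SLPs, then scan for occurrences."""
    t = slp_expand(text, t_start)
    p = slp_expand(pat, p_start)
    return [i for i in range(len(t) - len(p) + 1) if t.startswith(p, i)]


if __name__ == "__main__":
    slp, start = doubling_slp(40)
    # 41 rules describe a string of length 2**40; only the length is computed.
    print(len(slp), slp_length(slp, start))

    # A Fibonacci-style text SLP generating "abaab" and a pattern SLP for "ab".
    text = {"F1": "b", "F2": "a", "F3": ("F2", "F1"),
            "F4": ("F3", "F2"), "F5": ("F4", "F3")}
    pattern = {"A": "a", "B": "b", "P": ("A", "B")}
    print(naive_match(text, "F5", pattern, "P"))
```

The doubling grammar shows why a bound such as $M \le 2^m$ matters: an algorithm whose running time depends on $M$ only through $\log M$ stays polynomial in the compressed sizes, whereas the decompress-then-scan baseline above does not.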