We present an efficient algorithm for computing the LZ78 factorization of a text that is represented as a straight-line program (SLP), i.e., a context-free grammar in Chomsky normal form that generates a single string. Given an SLP of size n representing a text S of length N, our algorithm computes the LZ78 factorization of S in $O(n\sqrt{N}+m\log N)$ time and $O(n\sqrt{N}+m)$ space, where m is the number of resulting LZ78 factors. We also show how to improve the algorithm so that the $n\sqrt{N}$ term in the time and space complexities becomes either nL, where L is the length of the longest LZ78 factor, or $N-\alpha$, where $\alpha \ge 0$ is a quantity that depends on the amount of redundancy that the SLP captures with respect to substrings of S of a certain length. Since $m = O(N/\log_\sigma N)$, where $\sigma$ is the alphabet size, the latter is asymptotically at least as fast as a linear-time algorithm that runs on the uncompressed string when $\sigma$ is constant, and can be more efficient when the text is compressible, i.e., when m and n are small.
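To make the two objects in the abstract concrete, the following is a minimal Python sketch of the naive baseline: it fully expands an SLP and then computes the LZ78 factorization of the uncompressed string with a trie. It is not the algorithm of the paper, which avoids decompressing the text. The names `expand_slp`, `lz78_factorize`, and the dictionary encoding of the grammar are assumptions made for illustration only.

```python
def expand_slp(rules, start):
    """Expand an SLP into the single string it generates.

    `rules` (hypothetical encoding) maps each variable either to a
    one-character string (terminal rule) or to a pair (left, right)
    of variables (binary rule), as in Chomsky normal form.
    """
    memo = {}

    def expand(var):
        if var not in memo:
            rhs = rules[var]
            if isinstance(rhs, str):
                memo[var] = rhs
            else:
                memo[var] = expand(rhs[0]) + expand(rhs[1])
        return memo[var]

    return expand(start)


def lz78_factorize(text):
    """Return the LZ78 factors of `text` as a list of strings.

    Each factor is the longest previous factor that matches at the
    current position, extended by one character. The trie is stored as
    a dict keyed by (node id, character); node 0 is the root (empty factor).
    """
    trie = {}
    factors = []
    node = 0    # current trie node while matching a previous factor
    start = 0   # start position of the factor being built
    for i, ch in enumerate(text):
        nxt = trie.get((node, ch))
        if nxt is None:
            # No longer match exists: close the factor and add it to the trie.
            trie[(node, ch)] = len(factors) + 1
            factors.append(text[start:i + 1])
            node = 0
            start = i + 1
        else:
            node = nxt
    if start < len(text):
        # The text ended in the middle of a match; the final factor
        # may repeat an earlier one.
        factors.append(text[start:])
    return factors


# A small SLP of size n = 6 generating a Fibonacci-like string.
rules = {
    "X1": "a",
    "X2": "b",
    "X3": ("X1", "X2"),
    "X4": ("X3", "X1"),
    "X5": ("X4", "X3"),
    "X6": ("X5", "X4"),
}
text = expand_slp(rules, "X6")
factors = lz78_factorize(text)
```

This baseline takes time and space proportional to N (up to dictionary overhead), which is the cost the bounds above are compared against; an SLP can be exponentially smaller than the string it generates, so expanding it first can be wasteful.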