Computing Lempel-Ziv factorization online

Authors:
Tatiana Starikovskaya
Affiliations:
Lomonosov Moscow State University, Moscow, Russia
Venue:
MFCS'12 Proceedings of the 37th international conference on Mathematical Foundations of Computer Science
Year:
2012

Abstract

We present an algorithm which computes the Lempel-Ziv factorization of a word $W$ of length $n$ on an alphabet $\Sigma$ of size $\sigma$ online in the following sense: it reads $W$ starting from the left, and, after reading each $r = O(\log_\sigma n)$ characters of $W$, updates the Lempel-Ziv factorization. The algorithm requires $O(n \log \sigma)$ bits of space and $O(n \log^2 n)$ time. The basis of the algorithm is a sparse suffix tree combined with wavelet trees.
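
For readers unfamiliar with the object being computed, the following is a minimal sketch of a Lempel-Ziv factorization, not the algorithm of the paper. It assumes the common self-referencing variant, in which each factor is either a single letter that has not occurred before or the longest prefix of the remaining text that also starts at some earlier position (overlap with the factor itself is allowed). The function name `lz_factorize` is hypothetical, and the naive scan takes polynomial time, with no sparse suffix tree or wavelet trees involved.

```python
def lz_factorize(w: str) -> list[str]:
    """Naive Lempel-Ziv factorization (assumed self-referencing variant).

    Illustration only: this brute-force scan is far slower than the
    bounds stated in the abstract and uses none of its data structures.
    """
    factors = []
    n = len(w)
    i = 0
    while i < n:
        longest = 0
        # Try every earlier start j < i and extend the match as far as
        # possible; the match may run past position i (self-overlap).
        for j in range(i):
            k = 0
            while i + k < n and w[j + k] == w[i + k]:
                k += 1
            longest = max(longest, k)
        # A letter with no earlier occurrence forms a factor of length 1.
        length = max(longest, 1)
        factors.append(w[i:i + length])
        i += length
    return factors


print(lz_factorize("abababb"))
```

On the word `abababb` the sketch yields the factors `a`, `b`, `abab`, `b`. An online algorithm in the sense of the abstract would maintain this list incrementally as characters arrive, rather than rescanning the whole word as is done here.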