Dictionary-based compression schemes have been the most commonly used data compression schemes since they appeared in the foundational 1977 paper of Ziv and Lempel, whose scheme is generally referred to as LZ77. Their work is the basis of Zip, gZip, 7-Zip and many other compression utilities. Some of these schemes use variants of the greedy approach to parse the text into dictionary phrases; others have abandoned the greedy approach to improve the compression ratio. Recently, two bit-optimal parsing algorithms have been presented, closing the gap between theory and best practice. We present a survey of the parsing problem for dictionary-based text compression, identifying notable theoretical and practical results that have appeared in the last three decades. We follow the historical steps of the Zip scheme, showing how the original optimal parsing problem, which asks for a parse formed by the minimum number of phrases, has been replaced by the bit-optimal parsing problem, where the goal is to minimise the length in bits of the encoded text.
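The difference between the three parsing goals mentioned above (greedy, minimum number of phrases, minimum number of bits) can be illustrated with a toy sketch. It is not the algorithm of any paper surveyed here: it assumes a small static dictionary rather than an LZ77 sliding window, and the phrase lengths in bits are made-up values chosen only for illustration. The names `greedy_parse`, `optimal_parse` and `BITS` are hypothetical.

```python
# Toy illustration only: a static dictionary with invented code lengths.
# Real LZ77-style schemes build the dictionary from the already-seen text.
BITS = {"a": 2, "b": 2, "ab": 3, "bbb": 9}  # phrase -> assumed cost in bits


def greedy_parse(text, dictionary):
    """At each position, take the longest dictionary phrase that matches."""
    phrases, i = [], 0
    while i < len(text):
        match = max((p for p in dictionary if text.startswith(p, i)),
                    key=len, default=None)
        if match is None:
            raise ValueError(f"no phrase matches at position {i}")
        phrases.append(match)
        i += len(match)
    return phrases


def optimal_parse(text, dictionary, cost):
    """Minimum-cost parse as a shortest path over text positions 0..n.

    There is an edge i -> i + len(p) of weight cost(p) for every phrase p
    that matches the text at position i. Edges only go forward, so one
    left-to-right pass relaxes them in a valid order.
    """
    n = len(text)
    best = [0] + [float("inf")] * n   # best[j]: cheapest cost to reach j
    back = [None] * (n + 1)           # back[j]: last phrase on that path
    for i in range(n):
        if best[i] == float("inf"):
            continue                  # position i is unreachable
        for p in dictionary:
            if text.startswith(p, i):
                j = i + len(p)
                if best[i] + cost(p) < best[j]:
                    best[j] = best[i] + cost(p)
                    back[j] = p
    if best[n] == float("inf"):
        raise ValueError("text cannot be parsed with this dictionary")
    phrases, j = [], n
    while j > 0:                      # walk the back-pointers to recover the parse
        phrases.append(back[j])
        j -= len(back[j])
    return phrases[::-1]


def total_bits(phrases):
    return sum(BITS[p] for p in phrases)


text = "abbb"
greedy = greedy_parse(text, BITS)
fewest = optimal_parse(text, BITS, cost=lambda p: 1)        # count phrases
cheapest = optimal_parse(text, BITS, cost=lambda p: BITS[p])  # count bits
```

Under these assumed costs, the greedy parse of `abbb` uses three phrases, the minimum-phrase parse uses two, and the two-phrase parse is the more expensive one in bits. This is the kind of mismatch that motivates measuring a parse by encoded length rather than by phrase count.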