We compare the compression ratio of the Lempel--Ziv algorithms with the empirical entropy of the input string. This approach makes it possible to analyze the performance of these algorithms without any assumption on the input and to obtain worst-case results. We show that in this setting the standard definition of an optimal compression algorithm is not satisfactory. In fact, although Lempel--Ziv algorithms are optimal according to the standard definition, there exist families of low-entropy strings which are not compressed optimally. More precisely, the compression ratio achieved by LZ78 (resp., LZ77) can be much higher than the zeroth order entropy $H_0$ (resp., the first order entropy $H_1$). For this reason we introduce the concept of a $\lambda$-optimal algorithm. An algorithm is $\lambda$-optimal with respect to $H_k$ if, loosely speaking, its compression ratio is asymptotically bounded by $\lambda$ times the $k$th order empirical entropy $H_k$. We prove that LZ78 cannot be $\lambda$-optimal with respect to any $H_k$ with $k\geq 0$. Then, we describe a new algorithm which combines LZ78 with run length encoding (RLE) and is 3-optimal with respect to $H_0$. Finally, we prove that LZ77 is 8-optimal with respect to $H_0$, and that it cannot be $\lambda$-optimal with respect to $H_k$ for any $k\geq 1$.
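The two quantities compared in the abstract can be illustrated with a minimal sketch. It assumes the common definition of the $k$th order empirical entropy (each symbol is grouped by the $k$ symbols preceding it, logarithms are base 2) and the textbook greedy LZ78 parsing. The function names `empirical_entropy` and `lz78_phrases` are hypothetical and chosen for illustration only; the paper's exact definition of compression ratio and its output encoding are not reproduced here.

```python
from collections import Counter, defaultdict
from math import log2


def empirical_entropy(s: str, k: int = 0) -> float:
    """k-th order empirical entropy of s in bits per symbol (assumed definition)."""
    n = len(s)
    if n == 0:
        return 0.0
    # Group every symbol by the k symbols that precede it (its context).
    followers = defaultdict(Counter)
    for i in range(k, n):
        followers[s[i - k:i]][s[i]] += 1
    # Sum the zeroth order cost of the symbols seen in each context.
    total_bits = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        total_bits -= sum(c * log2(c / m) for c in counts.values())
    return total_bits / n


def lz78_phrases(s: str) -> list[str]:
    """Greedy LZ78 parsing: each phrase is a previously seen phrase plus one symbol."""
    seen = set()
    phrases = []
    current = ""
    for ch in s:
        current += ch
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # The last phrase may coincide with an earlier one.
        phrases.append(current)
    return phrases
```

On a run of a single symbol, `empirical_entropy` returns 0 while `lz78_phrases` still emits a number of phrases that grows roughly like the square root of the input length. This gives some intuition for why an entropy-relative bound can fail for LZ78 on low-entropy strings, and why combining it with run length encoding helps.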