Parallel algorithms for data compression
Journal of the ACM (JACM)
Data compression: methods and theory
Text compression
Asymptotic behavior of the Lempel-Ziv parsing scheme and digital search trees
Theoretical Computer Science - Special volume on mathematical analysis of algorithms (dedicated to D. E. Knuth)
On the entropy of DNA: algorithms and measurements based on memory and rapid convergence
Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms
On the optimality of parsing in dynamic dictionary based data compression
Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
Linear Algorithm for Data Compression via String Matching
Journal of the ACM (JACM)
Redundancy of the Lempel-Ziv-Welch Code
DCC '97 Proceedings of the Conference on Data Compression
The effect of non-greedy parsing in Ziv-Lempel compression methods
DCC '95 Proceedings of the Conference on Data Compression
Efficient randomized pattern-matching algorithms
IBM Journal of Research and Development - Mathematics and computing
Average profile and limiting distribution for a phrase size in the Lempel-Ziv parsing algorithm
IEEE Transactions on Information Theory
Dictionary-symbolwise flexible parsing
IWOCA'10 Proceedings of the 21st international conference on Combinatorial algorithms
Dictionary-symbolwise flexible parsing
Journal of Discrete Algorithms
We report on the performance evaluation of greedy parsing with a single-step lookahead (which we call flexible parsing, or FP) as an alternative to the commonly used greedy parsing scheme (with no lookahead). Greedy parsing is the basis of most popular compression programs, including UNIX compress and gzip; however, it usually results in far from optimal parsing/compression with regard to the dictionary construction scheme in use. Flexible parsing, however, is optimal [MS99], i.e., it partitions any given input into the smallest number of phrases possible, for dictionary construction schemes which satisfy the prefix property throughout their execution.

We focus on the application of FP in the context of the LZW variant of the Lempel-Ziv'78 dictionary construction method [Wel84, ZL78], which is of considerable practical interest. We implement two compression algorithms which use (1) FP with the LZW dictionary (LZW-FP), and (2) FP with an alternative flexible dictionary (FPA, as introduced in [Hor95]). Our implementations are based on novel on-line data structures enabling us to use linear time and space. We test our implementations on a collection of input sequences which includes textual files, DNA sequences, medical images, and pseudorandom binary files, and compare our results with two of the most popular compression programs, UNIX compress and gzip. Our results demonstrate that flexible parsing is especially useful for non-textual data, on which it improves over the compression rates of compress and gzip by up to 20% and 35%, respectively.
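The difference between greedy parsing and one-step-lookahead parsing can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a small static dictionary that satisfies the prefix property (every prefix of a phrase is also a phrase), whereas LZW-FP and FPA build their dictionaries dynamically, and it uses plain substring lookups rather than the linear-time on-line data structures mentioned above. The function names and the example dictionary are hypothetical.

```python
def longest_match(text, i, dictionary, max_len):
    """Length of the longest dictionary phrase starting at text[i] (0 at end of input)."""
    best = 0
    for length in range(1, min(max_len, len(text) - i) + 1):
        if text[i:i + length] in dictionary:
            best = length
        else:
            break  # prefix property: no longer phrase can match either
    return best


def greedy_parse(text, dictionary):
    """Always emit the longest phrase matching at the current position."""
    max_len = max(map(len, dictionary))
    phrases, i = [], 0
    while i < len(text):
        length = longest_match(text, i, dictionary, max_len)
        if length == 0:
            raise ValueError(f"no phrase matches at position {i}")
        phrases.append(text[i:i + length])
        i += length
    return phrases


def flexible_parse(text, dictionary):
    """One-step lookahead: emit the phrase whose *following* longest match ends farthest."""
    max_len = max(map(len, dictionary))
    phrases, i = [], 0
    while i < len(text):
        longest = longest_match(text, i, dictionary, max_len)
        if longest == 0:
            raise ValueError(f"no phrase matches at position {i}")

        def reach(length):
            # Position reached after this phrase plus the longest phrase after it;
            # ties are broken in favour of the longer current phrase.
            nxt = i + length
            return (nxt + longest_match(text, nxt, dictionary, max_len), length)

        best = max(range(1, longest + 1), key=reach)
        phrases.append(text[i:i + best])
        i += best
    return phrases


# Hypothetical prefix-closed dictionary, chosen only for illustration.
DICTIONARY = {"a", "b", "c", "d", "ab", "bc", "bcd"}

print(greedy_parse("abcd", DICTIONARY))    # ['ab', 'c', 'd']
print(flexible_parse("abcd", DICTIONARY))  # ['a', 'bcd']
```

On this input greedy parsing commits to "ab" and is then forced to emit "c" and "d" separately, while the lookahead variant accepts the shorter phrase "a" so that "bcd" can follow, giving two phrases instead of three.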