Text compression is often done using a fixed, previously formed dictionary (code book) that specifies which substrings of the text can be replaced by code words. An optimal solution to the text-encoding problem always exists. Because the known optimal algorithms have long processing times, several heuristics have been proposed in the literature. In this paper, the worst-case compression gains obtained by the longest match and the greedy heuristics are studied for various types of dictionaries. For general dictionaries, the performance of the heuristics can be almost the weakest possible. In practice, however, dictionaries usually have properties that lead to a space-optimal or near-space-optimal coding result with the heuristics.
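The gap between a heuristic and a space-optimal encoding can be seen on a toy case. The sketch below is illustrative only and is not taken from the paper: the function names, the sample dictionary, and the unit code-word costs are all assumptions. It takes "longest match" to mean a left-to-right parse that always picks the longest dictionary entry at the current position, and computes the optimal parse by dynamic programming over text positions. It further assumes the dictionary contains every single character of the text, so that some parse always exists.

```python
from typing import Dict, List, Optional


def longest_match_parse(text: str, costs: Dict[str, int]) -> List[str]:
    """Left-to-right parse taking the longest dictionary entry at each position."""
    max_len = max(len(w) for w in costs)
    parse: List[str] = []
    i = 0
    while i < len(text):
        # Try candidate lengths from longest to shortest.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in costs:
                parse.append(text[i:i + n])
                i += n
                break
        else:
            raise ValueError(f"no dictionary entry matches at position {i}")
    return parse


def optimal_parse(text: str, costs: Dict[str, int]) -> List[str]:
    """Minimum-total-cost parse via dynamic programming (shortest path)."""
    n = len(text)
    max_len = max(len(w) for w in costs)
    inf = float("inf")
    best = [inf] * (n + 1)          # best[j]: cheapest encoding of text[:j]
    back: List[Optional[str]] = [None] * (n + 1)
    best[0] = 0
    for i in range(n):
        if best[i] == inf:
            continue                # position i is unreachable
        for length in range(1, min(max_len, n - i) + 1):
            word = text[i:i + length]
            if word in costs and best[i] + costs[word] < best[i + length]:
                best[i + length] = best[i] + costs[word]
                back[i + length] = word
    if best[n] == inf:
        raise ValueError("text cannot be parsed with this dictionary")
    # Recover the parse by walking the back-pointers from the end.
    parse: List[str] = []
    j = n
    while j > 0:
        word = back[j]
        assert word is not None
        parse.append(word)
        j -= len(word)
    return parse[::-1]


def encoded_cost(parse: List[str], costs: Dict[str, int]) -> int:
    """Total code-word cost of a parse."""
    return sum(costs[w] for w in parse)


# Hypothetical dictionary: every code word costs one unit.
costs = {"a": 1, "b": 1, "ab": 1, "bbb": 1}
text = "abbb"
lm = longest_match_parse(text, costs)
opt = optimal_parse(text, costs)
print(lm, encoded_cost(lm, costs))
print(opt, encoded_cost(opt, costs))
```

On this input the longest match parse commits to "ab" and is then forced to emit two single "b" entries, while the dynamic program finds the cheaper split into "a" and "bbb". The dynamic program examines every dictionary entry at every position, which hints at why cheaper heuristics are attractive for long texts and large dictionaries.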