Efficient variants of Huffman codes in high level languages

  • Authors:
  • Y. Choueka; S. T. Klein; Y. Perl

  • Affiliations:
  • Institute for Information Retrieval and Computational Linguistics (The Responsa Project) and Department of Mathematics and Computer Science, Bar-Ilan University, Ramat Gan, Israel; Institute for Information Retrieval and Computational Linguistics (The Responsa Project) and Department of Mathematics and Computer Science, Bar-Ilan University, Ramat Gan, Israel and Department o ...; Institute for Information Retrieval and Computational Linguistics (The Responsa Project) and Department of Mathematics and Computer Science, Bar-Ilan University, Ramat Gan, Israel and Department o ...

  • Venue:
  • SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 1985

Abstract

Although it is well known that Huffman codes are optimal for text compression in a character-per-character encoding scheme, they are seldom used in practical situations since they require a bit-per-bit decoding algorithm, which has to be written in some assembly language and performs rather slowly. A number of methods are presented that avoid these difficulties. The decoding algorithms efficiently process the encoded string on a byte-per-byte basis, are faster than the original algorithm, and can be programmed in any high-level language. This is achieved at the cost of storing some tables in internal memory, but with no loss in the compression savings of the optimal Huffman codes. The internal memory space needed can be reduced either at the cost of increased processing time, or by using non-binary Huffman codes, which give sub-optimal compression. Experimental results for English and Hebrew text are also presented.
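
The byte-per-byte idea can be illustrated with a small sketch. It is not taken from the paper and may differ from the authors' actual tables; it only assumes one plausible layout: for every internal node of the Huffman tree (a "state") and every possible byte value, a precomputed entry holds the characters completed while consuming that byte and the state reached afterwards. All function names here (build_tree, code_table, encode, build_tables, decode) are hypothetical, and the text is assumed to contain at least two distinct characters.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Huffman tree: leaves are characters, internal nodes are (left, right) tuples."""
    tie = count()  # tie-breaker so the heap never compares tree nodes
    heap = [(freq, next(tie), ch) for ch, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (a, b)))
    return heap[0][2]


def code_table(tree):
    """Map each character to its codeword, written as a string of '0'/'1'."""
    codes, stack = {}, [(tree, "")]
    while stack:
        node, prefix = stack.pop()
        if isinstance(node, tuple):
            stack.append((node[0], prefix + "0"))
            stack.append((node[1], prefix + "1"))
        else:
            codes[node] = prefix
    return codes


def encode(text, codes):
    """Concatenate codewords and pack them into bytes, zero-padding the last byte."""
    bits = "".join(codes[ch] for ch in text)
    bits += "0" * (-len(bits) % 8)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def build_tables(tree):
    """table[state][byte] -> (characters emitted, next state); state 0 is the root."""
    states, index, stack = [], {}, [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, tuple):
            index[node] = len(states)
            states.append(node)
            stack.extend(node)
    table = []
    for start in states:
        row = []
        for byte in range(256):
            node, out = start, []
            for shift in range(7, -1, -1):  # most significant bit first
                node = node[(byte >> shift) & 1]
                if not isinstance(node, tuple):  # reached a leaf
                    out.append(node)
                    node = tree  # restart at the root
            row.append(("".join(out), index[node]))
        table.append(row)
    return table


def decode(data, table, n_chars):
    """Decode one byte per step; n_chars discards characters produced by padding bits."""
    state, out = 0, []
    for byte in data:
        chunk, state = table[state][byte]
        out.append(chunk)
    return "".join(out)[:n_chars]


if __name__ == "__main__":
    text = "abracadabra"
    tree = build_tree(text)
    data = encode(text, code_table(tree))
    assert decode(data, build_tables(tree), len(text)) == text
```

In this sketch the decoding loop performs a single table lookup per input byte and never manipulates individual bits. The price is a table with 256 entries for each internal node of the tree, which corresponds to the memory-for-speed trade-off the abstract describes.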