On the implementation of minimum-redundancy prefix codes

Authors:
A. Moffat;A. Turpin
Venue:
DCC '96 Proceedings of the Conference on Data Compression
Year:
1996

Citing 0
Cited 1

An efficient compression code for text databases

ECIR'03 Proceedings of the 25th European conference on IR research

Abstract

Minimum-redundancy coding (also known as Huffman coding, after Huffman (1952)) is one of the enduring techniques of data compression. We examine how minimum-redundancy coding can best be implemented, with particular emphasis on the situation when n, the number of symbols in the source alphabet, is large, perhaps of the order of 10^6. We review techniques for devising minimum-redundancy codes, and consider in detail how encoding and decoding should be accomplished. In particular, we describe a modified decoding method that allows improved decoding throughput, requiring just a few machine operations per output symbol (rather than per decoded bit), and using just a few hundred bytes of memory beyond the space required to store an enumeration of the source alphabet. We review methods for calculating codeword lengths, show how those codeword lengths should be used to derive a minimum-redundancy code that has the alphabetic sequence property, and describe a memory-compact method for decoding such canonical codes. An improved method for decoding canonical codes is also presented.
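
To make the idea of a canonical code concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the codeword lengths are already known and form a complete prefix code, and it uses the DEFLATE-style ordering in which shorter codewords receive numerically smaller values, which may differ from the convention in the paper. The function names (`build_tables`, `codewords`, `decode`) are hypothetical. The decoder reads one bit at a time for clarity, so it illustrates the compact per-length tables rather than the faster per-symbol method the abstract describes.

```python
def build_tables(lengths):
    """Build per-length tables for a canonical code.

    lengths[s] is the codeword length (>= 1) of symbol s.
    Assumption: the lengths satisfy the Kraft equality (complete code).
    """
    max_len = max(lengths)
    # count[l] = number of codewords of length l
    count = [0] * (max_len + 1)
    for length in lengths:
        count[length] += 1
    # first_code[l] = integer value of the first codeword of length l
    first_code = [0] * (max_len + 1)
    code = 0
    for length in range(1, max_len + 1):
        code = (code + count[length - 1]) << 1
        first_code[length] = code
    # symbols sorted by (length, symbol number): the only per-symbol storage
    symbols = sorted(range(len(lengths)), key=lambda s: (lengths[s], s))
    # offset[l] = position in `symbols` of the first symbol of length l
    offset = [0] * (max_len + 1)
    for length in range(1, max_len + 1):
        offset[length] = offset[length - 1] + count[length - 1]
    return count, first_code, offset, symbols


def codewords(lengths, tables):
    """Return {symbol: bit string}; equal-length codewords are consecutive."""
    _, first_code, _, symbols = tables
    next_code = list(first_code)
    words = {}
    for s in symbols:
        length = lengths[s]
        words[s] = format(next_code[length], "0{}b".format(length))
        next_code[length] += 1
    return words


def decode(bits, tables):
    """Decode a string of '0'/'1' characters, one bit at a time."""
    count, first_code, offset, symbols = tables
    max_len = len(count) - 1
    out = []
    code = 0
    length = 0
    for bit in bits:
        code = (code << 1) | int(bit)
        length += 1
        if length > max_len:
            raise ValueError("invalid bit stream")
        # A codeword of this length lies in [first_code, first_code + count)
        if first_code[length] <= code < first_code[length] + count[length]:
            out.append(symbols[offset[length] + code - first_code[length]])
            code = 0
            length = 0
    return out


lengths = [2, 1, 3, 3]  # illustrative lengths for symbols 0..3
tables = build_tables(lengths)
words = codewords(lengths, tables)
message = [1, 0, 2, 3, 1]
encoded = "".join(words[s] for s in message)
decoded = decode(encoded, tables)
```

Beyond the sorted symbol list, the decoder needs only three small arrays indexed by codeword length, which is why the extra memory for a canonical code can be kept very small even when n is large.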