Asymptotic average redundancy of Huffman (and other) block codes

  • Authors:
  • W. Szpankowski

  • Affiliations:
  • Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2006

Quantified Score

Hi-index 754.96

Visualization

Abstract

We study asymptotically the redundancy of Huffman (and other) codes. It has been known from the inception of the Huffman (1952) code that in the worst case its redundancy-defined as the excess of the code length over the optimal (ideal) code length-is not more than one. However, to the best of our knowledge no precise asymptotic results have been reported in literature thus far. We consider here a memoryless binary source generating a sequence of length n distributed as binomial (n, p) with p being the probability of emitting 0. Based on the results of Stubley (1994), we prove that for p<1/2 the average redundancy R¯nH of the Huffman code becomes as n→∞: R¯nH={(3/2-(1/ln2+o(1))=0.057304…, α irrational); (3/2-(1/M)(〈βMn〉-½)); (-(1/M(1-2-1M/))2-〈nβM〉M/); (+O(ρn), α=N/M rational); where α=log2 (1-p)/p and β=-log2(1-p), ρ<1, M, N are integers such that gcd (N, M)=1, and 〈x〉=x-[x] is the fractional part of x. The appearance of the fractal-like function 〈βMn〉 explains the erratic behavior of the Huffman redundancy, and its “resistance” to succumb to a precise analysis. As a side result, we prove that the average redundancy of the Shannon block code is as n→∞: R¯nS{(½+o(1), α irrational); (½-1/M (〈Mnβ〉-½)); (+O(ρn), α=N/M rational); where ρ<1. Finally, we derive the redundancy of the Golomb (1966) code (for the geometric distribution) which can be viewed as a special case of the Huffman and Shannon codes, Golomb's code redundancy has only oscillating behavior (i.e., there is not convergent mode). These findings are obtained by analytic methods such as theory of distribution of sequences modulo 1 and Fourier series