On the expected codeword length per symbol of optimal prefix codes for extended sources

  • Authors:
  • Jay Cheng

  • Affiliations:
  • Department of Electrical Engineering and the Institute of Communications Engineering, National Tsing Hua University, Hsinchu, Taiwan

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2009

Abstract

Given a discrete memoryless source X, it is well known that the expected codeword length per symbol L_n(X) of an optimal prefix code for the extended source X^n converges to the source entropy as n approaches infinity. However, the sequence L_n(X) need not be monotonic in n, so coding efficiency cannot be improved simply by encoding a larger block of source symbols unless the block length is chosen appropriately. Since the encoding and decoding complexity grows exponentially with the block length, it is of practical interest to know when an increase in the block length guarantees a decrease in the expected codeword length per symbol. While this paper does not provide a complete answer to that question, we establish several properties of L_n(X) and, for each n ≥ 1 and nondyadic p_1^n (where p_1 is the probability of the most likely source symbol), obtain an integer k* such that L_{kn}(X) < L_n(X) for all k ≥ k*, implying that encoding blocks of length kn is strictly more efficient than encoding blocks of length n for all k ≥ k*. Restricting attention to block lengths that are multiples of n makes the question more tractable, since L_{kn}(X) ≤ L_n(X) is guaranteed for all n ≥ 1 and k ≥ 1; our results identify when increasing the multiplicative factor k guarantees a strict improvement. These results extend and generalize those of Montgomery and Kumar.
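
The quantity L_n(X) discussed in the abstract can be computed directly for small alphabets and block lengths by building a Huffman (optimal prefix) code for the product distribution of X^n. Below is a minimal sketch in Python, not taken from the paper; the source distribution and function names are illustrative assumptions. It tabulates L_n(X) against the source entropy and also makes concrete the exponential growth in block length noted above.

```python
# A minimal sketch (not from the paper): numerically tabulating L_n(X), the
# expected codeword length per symbol of a Huffman code for the extended
# source X^n. The source distribution below is an assumed example; the
# extended alphabet grows exponentially in n, as noted in the abstract.
import heapq
import itertools
import math


def huffman_expected_length(probs):
    """Expected codeword length of a Huffman code for distribution `probs`.

    Uses the identity that the expected length equals the sum of the merged
    probabilities over all merge steps of the Huffman construction.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


def expected_length_per_symbol(p, n):
    """L_n(X): Huffman codeword length for X^n divided by the block length n."""
    extended = [math.prod(block) for block in itertools.product(p, repeat=n)]
    return huffman_expected_length(extended) / n


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]  # assumed source distribution, for illustration only
    entropy = -sum(q * math.log2(q) for q in p)
    for n in range(1, 6):
        print(f"n = {n}: L_n(X) = {expected_length_per_symbol(p, n):.4f} "
              f"(entropy = {entropy:.4f})")
```

The merge-sum identity avoids building the code tree explicitly, since only the expected length is needed here; printing the table for a few values of n shows L_n(X) approaching the entropy without necessarily decreasing at every step.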