Tunstall code, Khodak variations, and random walks

  • Authors: Michael Drmota; Yuriy A. Reznik; Wojciech Szpankowski

  • Affiliations: Institute of Discrete Mathematics and Geometry, TU Wien, Wien, Austria; Qualcomm Inc., San Diego, CA; Department of Computer Science, Purdue University, West Lafayette, IN

  • Venue: IEEE Transactions on Information Theory
  • Year: 2010

Abstract

A variable-to-fixed length encoder partitions the source string into variable-length phrases that belong to a given, fixed dictionary. Tunstall, and independently Khodak, designed variable-to-fixed length codes for memoryless sources that are optimal under certain constraints. In this paper, we study the Tunstall and Khodak codes using a variety of techniques, ranging from stopping times for sums of independent random variables to Tauberian theorems and the Mellin transform. After proposing an algebraic characterization of the Tunstall and Khodak codes, we present new results on the variance of, and a central limit theorem for, dictionary phrase lengths. This analysis also provides a new argument for obtaining asymptotic results about the mean dictionary phrase length and average redundancy rates.
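
To make the object of study concrete, the following is a minimal sketch of the standard Tunstall dictionary construction for a memoryless source: start with one phrase per symbol and repeatedly replace the most probable phrase by its one-symbol extensions while the dictionary stays within a size bound. It is an illustration only, not the paper's analysis or its algebraic characterization. The names `tunstall_dictionary`, `mean_phrase_length`, `probs`, and `max_size` are hypothetical, and symbols are assumed to be single characters.

```python
import heapq


def tunstall_dictionary(probs, max_size):
    """Return {phrase: probability} for a Tunstall dictionary.

    probs    -- dict mapping single-character symbols to probabilities
                (assumed to sum to 1; memoryless source)
    max_size -- upper bound on the number of dictionary phrases
    """
    symbols = sorted(probs)
    if max_size < len(symbols):
        raise ValueError("max_size must be at least the alphabet size")

    # heapq is a min-heap, so probabilities are stored negated
    # to pop the most probable phrase first.
    heap = [(-probs[s], s) for s in symbols]
    heapq.heapify(heap)

    # Expanding one phrase removes it and adds len(symbols) extensions,
    # a net growth of len(symbols) - 1 phrases.
    while len(heap) + len(symbols) - 1 <= max_size:
        neg_p, phrase = heapq.heappop(heap)
        for s in symbols:
            heapq.heappush(heap, (neg_p * probs[s], phrase + s))

    return {phrase: -neg_p for neg_p, phrase in heap}


def mean_phrase_length(dictionary):
    """Expected phrase length under the phrase probabilities."""
    return sum(len(phrase) * p for phrase, p in dictionary.items())


if __name__ == "__main__":
    d = tunstall_dictionary({"a": 0.7, "b": 0.3}, max_size=4)
    for phrase in sorted(d):
        print(phrase, round(d[phrase], 3))
    print("mean phrase length:", round(mean_phrase_length(d), 3))
```

With the assumed source P(a) = 0.7, P(b) = 0.3 and at most four phrases, the sketch yields the dictionary {aaa, aab, ab, b}. The quantity computed by `mean_phrase_length` is the mean dictionary phrase length whose asymptotics, together with the variance and limiting distribution of the phrase length, are the subject of the paper.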