On universal types

  • Authors:
  • G. Seroussi

  • Affiliations:
  • Hewlett-Packard Labs., Palo Alto, CA

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2006

Quantified Score

Hi-index 754.96

Visualization

Abstract

The universal type class of a sequence xn is defined, in analogy to the notion underlying the classical method of types. Two sequences of the same length are said to be of the same universal (LZ) type if and only if they yield the same dictionary (or, equivalently, parsing tree) in the incremental parsing of Ziv and Lempel (1978). It is shown that for any finite order k, the variational distance between the kth-order empirical probability distributions of two sequences of the same universal type vanishes as the sequence length tends to infinity. Consequently, for any k and any kth-order probability assignment, the difference between the normalized logarithms of the probabilities assigned to two sequences of the same universal type also vanishes asymptotically. The size of a universal type class is studied, and it is shown that its asymptotic behavior parallels that of the conventional counterpart, with the LZ78 code length playing the role of the empirical entropy. The number of universal types for sequences of length n is estimated, and shown to be of the form exp((1+o(1))gamman/logn) for a well characterized constant gamma. Algorithms for enumerating the sequences in a universal type class, and for drawing a sequence from the class with uniform probability are described. As an application, the problem of universal simulation of individual sequences is considered. A sequence drawn with uniform probability from the universal type class of xn is an optimal simulation of xn in a well defined mathematical sense