The empirical distribution of rate-constrained source codes

  • Authors:
  • T. Weissman; E. Ordentlich

  • Affiliations:
  • Electr. Eng. Dept., Stanford Univ., CA, USA;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2005

Quantified Score

Hi-index 754.96

Abstract

Let $\mathbf{X} = (X_1, \ldots)$ be a stationary ergodic finite-alphabet source, $X^n$ denote its first $n$ symbols, and $Y^n$ be the codeword assigned to $X^n$ by a lossy source code. The empirical $k$th-order joint distribution $\hat{Q}^k[X^n, Y^n](x^k, y^k)$ is defined as the frequency of appearances of pairs of $k$-strings $(x^k, y^k)$ along the pair $(X^n, Y^n)$. Our main interest is in the sample behavior of this (random) distribution. Letting $I(Q^k)$ denote the mutual information $I(X^k; Y^k)$ when $(X^k, Y^k) \sim Q^k$, we show that for any (sequence of) lossy source code(s) of rate $\le R$, $\limsup_{n \to \infty} \frac{1}{k} I(\hat{Q}^k[X^n, Y^n]) \le R + \frac{1}{k} H(X_1^k) - \bar{H}(\mathbf{X})$ a.s., where $\bar{H}(\mathbf{X})$ denotes the entropy rate of $\mathbf{X}$. This is shown to imply, for a large class of sources including all independent and identically distributed (i.i.d.) sources and all sources satisfying the Shannon lower bound with equality, that for any sequence of codes which is good in the sense of asymptotically attaining a point on the rate distortion curve, $\hat{Q}^k[X^n, Y^n] \stackrel{d}{\Rightarrow} P_{X^k, \tilde{Y}^k}$ a.s. whenever $P_{X^k, \tilde{Y}^k}$ is the unique distribution attaining the minimum in the definition of the $k$th-order rate distortion function. Consequences of these results include a new proof of Kieffer's sample converse to lossy source coding, as well as performance bounds for compression-based denoisers.
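
As an illustration of the quantities defined in the abstract, the following is a minimal Python sketch of the empirical kth-order joint distribution and of the mutual information it induces. The function names `empirical_joint` and `mutual_information` are hypothetical, and two conventions are assumptions not taken from the paper: k-strings are counted over the n - k + 1 non-cyclic sliding windows, and mutual information is measured in bits.

```python
from collections import Counter, defaultdict
from math import log2


def empirical_joint(x, y, k):
    """Empirical kth-order joint distribution of k-string pairs along (x, y).

    Assumption: windows are non-cyclic, so there are n - k + 1 of them and
    each observed pair gets weight count / (n - k + 1).
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    windows = n - k + 1
    counts = Counter(
        (tuple(x[i:i + k]), tuple(y[i:i + k])) for i in range(windows)
    )
    return {pair: c / windows for pair, c in counts.items()}


def mutual_information(q):
    """Mutual information I(X^k; Y^k) in bits under the joint distribution q."""
    px = defaultdict(float)  # marginal of the source k-string
    py = defaultdict(float)  # marginal of the reconstruction k-string
    for (xk, yk), p in q.items():
        px[xk] += p
        py[yk] += p
    return sum(
        p * log2(p / (px[xk] * py[yk]))
        for (xk, yk), p in q.items()
        if p > 0
    )


if __name__ == "__main__":
    source = [0, 1, 0, 1, 1, 0, 0, 1]
    codeword = [0, 1, 0, 1, 1, 1, 0, 1]  # toy reconstruction, one symbol changed
    k = 2
    q_hat = empirical_joint(source, codeword, k)
    # Normalized quantity (1/k) I(Q^k) that appears on the left of the bound.
    print(mutual_information(q_hat) / k)
```

The printed value corresponds to the left-hand side of the bound for one fixed n; the result in the abstract concerns its limit superior as n grows, which a single finite sample can only hint at.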