BIC Context Tree Estimation for Stationary Ergodic Processes

  • Authors:
  • Z. Talata; T. E. Duncan

  • Affiliations:
  • Dept. of Math., Univ. of Kansas, Lawrence, KS, USA;-

  • Venue:
  • IEEE Transactions on Information Theory
  • Year:
  • 2011

Abstract

Context trees of arbitrary stationary ergodic processes with finite alphabets are considered. Such a process is not necessarily a Markov chain, so its context tree may have infinite depth. The Bayesian information criterion (BIC), calculated from a sample of size n and minimized over hypothetical context trees with no restriction on those trees, is shown to provide a strongly consistent estimator of the context tree of the process. Here strong consistency means that, eventually almost surely as n tends to infinity, the estimated context tree recovers the true one up to a level K. Under some conditions on the process, the recovery level K can grow with n at a specific rate determined by the distribution of the process, so the BIC estimator recovers the true context tree to larger and larger depths. In the special case where K is an arbitrary constant, strong consistency holds without any assumption on the stationary ergodic process. This by itself improves on existing results, which assumed either that the true context tree has finite depth or that the depth of the hypothetical context trees is bounded by o(log n).
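
To make the idea of "minimizing BIC over hypothetical context trees" concrete, the following is a minimal Python sketch and not the authors' procedure. It rests on several assumptions that the abstract does not state: the BIC of a tree is taken to be the negative maximized log-likelihood plus ((|A| - 1) / 2) log n per leaf, as is common in the context tree literature; the search is capped at a hypothetical `max_depth` for tractability, whereas the paper places no restriction on the hypothetical trees; and the minimization is done by bottom-up pruning. The function name `bic_context_tree` and all of its parameters are illustrative only.

```python
import math
from collections import defaultdict


def bic_context_tree(x, alphabet_size, max_depth):
    """Return (BIC value, leaves) of the BIC-minimizing tree of depth <= max_depth.

    x is a sequence of integers in range(alphabet_size). A context is a tuple
    of past symbols, most recent first. The depth cap is an assumption of this
    sketch, not part of the estimator described in the abstract.
    """
    n = len(x)
    if n <= max_depth:
        raise ValueError("sample must be longer than max_depth")

    # counts[ctx][a] = number of positions where symbol a follows context ctx.
    counts = defaultdict(lambda: [0] * alphabet_size)
    for i in range(max_depth, n):
        for length in range(max_depth + 1):
            ctx = tuple(x[i - 1 - j] for j in range(length))
            counts[ctx][x[i]] += 1

    # Assumed per-leaf penalty: (|A| - 1) / 2 * log n.
    penalty = 0.5 * (alphabet_size - 1) * math.log(n)

    def leaf_cost(ctx):
        # Negative maximized log-likelihood at this context, plus the penalty.
        c = counts[ctx]
        total = sum(c)
        nll = -sum(k * math.log(k / total) for k in c if k > 0)
        return nll + penalty

    def best(ctx):
        # Compare keeping ctx as a leaf against splitting it into its children.
        cost_leaf = leaf_cost(ctx)
        if len(ctx) == max_depth:
            return cost_leaf, [ctx]
        child_cost, child_leaves = 0.0, []
        for b in range(alphabet_size):
            child = ctx + (b,)
            if child in counts:  # skip contexts never seen in the sample
                cost, leaves = best(child)
                child_cost += cost
                child_leaves += leaves
        if child_leaves and child_cost < cost_leaf:
            return child_cost, child_leaves
        return cost_leaf, [ctx]

    return best(())
```

For example, on the alternating sample `[0, 1] * 100` with `alphabet_size=2` and `max_depth=3`, the sketch keeps only the depth-one contexts `(0,)` and `(1,)`, since one past symbol already determines the next; on a constant sample it returns the root context `()` alone.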