Applications of Kolmogorov Complexity and Universal Codes to Nonparametric Estimation of Characteristics of Time Series

  • Authors:
  • Boris Ryabko

  • Affiliations:
  • Siberian State University of Telecommunications and Informatics, Institute of Computational Technologies of Siberian Branch of Russian Academy of Science Kirov Street, 86, 630102, Novosibirsk, Rus ...

  • Venue:
  • Fundamenta Informaticae
  • Year:
  • 2008

Abstract

We consider finite-alphabet and real-valued time series and the following four problems: i) estimation of the (limiting) probability P(x$_0$ … x$_s$) for every s and each sequence x$_0$ … x$_s$ of letters from the process alphabet (or estimation of the density p(x$_0$, …, x$_s$) for real-valued time series), ii) so-called on-line prediction, where the conditional probability P(x$_{t+1}$∣x$_1$x$_2$ … x$_t$) (or the conditional density p(x$_{t+1}$∣x$_1$x$_2$ … x$_t$)) should be estimated given x$_1$x$_2$ … x$_t$, iii) regression, and iv) classification (or so-called problems with side information). We show that Kolmogorov complexity (KC) and universal codes (or universal data compressors), whose codeword length can be considered an estimate of KC, can be used as a basis for constructing asymptotically optimal methods for the above problems. (By definition, a universal code can "compress" any sequence generated by a stationary and ergodic source asymptotically to the Shannon entropy of the source.)
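
To make problem ii) concrete, the following is a minimal sketch of compression-based prediction for a finite alphabet. It is not the construction from the paper: it assumes that a codeword length |U(x)| in bits can be turned into a probability estimate proportional to 2^(-|U(x)|), and it uses Python's `zlib` as a convenient stand-in for a universal code (DEFLATE has a bounded window and byte-granular output, so it is only a rough proxy). The function names `code_length_bits` and `predict_next` are hypothetical and chosen for illustration.

```python
import zlib


def code_length_bits(data: bytes) -> int:
    """Length in bits of the zlib-compressed data; a rough proxy for |U(x)|."""
    return 8 * len(zlib.compress(data, 9))


def predict_next(history: bytes, alphabet: bytes) -> dict:
    """Estimate P(a | history) for each symbol a in the alphabet.

    Assumption: the weight of symbol a is 2^(-|U(history + a)|), and the
    weights are normalized over the alphabet so that they sum to 1.
    """
    lengths = {a: code_length_bits(history + bytes([a])) for a in alphabet}
    shortest = min(lengths.values())
    # Subtract the shortest length before exponentiating to avoid underflow.
    weights = {a: 2.0 ** -(n - shortest) for a, n in lengths.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


if __name__ == "__main__":
    history = b"ab" * 200
    for symbol, prob in predict_next(history, b"ab").items():
        print(chr(symbol), round(prob, 3))
    # Bits per symbol, a crude estimate of the entropy rate of the source.
    print(code_length_bits(history) / len(history))
```

Because `zlib` output is a whole number of bytes, the estimates are coarse and candidate symbols often tie; a compressor with finer-grained code lengths (for example, one based on arithmetic coding) would give smoother estimates under the same scheme.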