Most sequences are stochastic

  • Authors:
  • V. V. V'yugin

  • Affiliations:
  • Russian Academy of Sciences, Moscow, Russia, and Univ. of London, Surrey, England

  • Venue:
  • Information and Computation
  • Year:
  • 2001

Abstract

The central problem in machine learning (and statistics) is the problem of predicting future events x_{n+1} based on past observations x_1 x_2 ... x_n, where n = 1, 2, .... The main goal is to find a method of prediction that minimizes the total loss suffered on a sequence x_1 x_2 ... x_{n+1} for n = 1, 2, .... We say that a data sequence is stochastic if there exists a simply described prediction algorithm whose performance is close to the best possible one. This optimal performance is defined in terms of Vovk's predictive complexity, which is a generalization of the notion of Kolmogorov complexity. Predictive complexity gives a limit on the predictive performance of simply described prediction algorithms. In this paper we argue that data sequences normally occurring in the real world are stochastic; more formally, we prove that Levin's a priori semimeasure of nonstochastic sequences is small. Copyright 2001 Academic Press.
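As an illustration of the "total loss" that a prediction method suffers on a sequence, the following is a minimal sketch and not the paper's construction. It assumes binary outcomes and logarithmic loss, and uses Laplace's rule of succession as a stand-in for a "simply described prediction algorithm". The names `laplace_predict`, `log_loss`, and `total_loss` are hypothetical and chosen only for this example.

```python
import math

def laplace_predict(past):
    """Probability that the next bit is 1, by Laplace's rule of succession.

    Assumption: this simple rule stands in for a 'simply described
    prediction algorithm'; the abstract does not name a specific one.
    """
    return (sum(past) + 1) / (len(past) + 2)

def log_loss(outcome, p_one):
    """Logarithmic loss in bits for predicting P(next = 1) = p_one."""
    p = p_one if outcome == 1 else 1.0 - p_one
    return -math.log2(p)

def total_loss(sequence, predictor):
    """Cumulative loss: each x_{n+1} is predicted from x_1 ... x_n only."""
    loss = 0.0
    for n, outcome in enumerate(sequence):
        loss += log_loss(outcome, predictor(sequence[:n]))
    return loss

if __name__ == "__main__":
    xs = [1, 1, 0, 1, 1, 1, 0, 1]
    # Compare a simple adaptive predictor with a constant 1/2 predictor.
    print(total_loss(xs, laplace_predict))
    print(total_loss(xs, lambda past: 0.5))
```

In the abstract's terms, a sequence would count as stochastic when some simple predictor of this kind achieves a total loss close to the sequence's predictive complexity; the sketch only computes the loss side of that comparison.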