The Minimum Description Length (MDL) principle for online sequence estimation/prediction in a proper learning setup is studied. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is finite, implying convergence with probability one, and (b) it additionally specifies the convergence speed. For MDL, in general one can only prove loss bounds that are finite but exponentially larger than those for Bayes mixtures. We show that this is the case even if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. We discuss the application to Machine Learning tasks such as classification and hypothesis testing, and the generalization to countable classes of i.i.d. models.
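To make the gap between the two kinds of guarantees concrete, here is a minimal sketch of their standard forms, assuming a countable model class containing the true distribution \mu, prior weights w_\nu with \sum_\nu w_\nu \le 1, the Bayes mixture \xi = \sum_\nu w_\nu \nu, and an MDL predictor \hat\nu_t; this notation and the constants are illustrative assumptions, not taken verbatim from the paper.

% Bayes mixture: the total expected square prediction error is only
% logarithmic in 1/w_mu (the classical Solomonoff/Hutter-style bound).
\[
  \sum_{t=1}^{\infty} \mathbf{E}_\mu \sum_{a \in \{0,1\}}
    \bigl(\xi(a \mid x_{<t}) - \mu(a \mid x_{<t})\bigr)^2
  \;\le\; \ln w_\mu^{-1}
\]
% MDL predictor \hat\nu_t (the model minimizing the two-part code length
% of x_{<t}): in general one can only guarantee a bound that is
% exponentially larger, of order 1/w_mu rather than ln(1/w_mu).
\[
  \sum_{t=1}^{\infty} \mathbf{E}_\mu \sum_{a \in \{0,1\}}
    \bigl(\hat\nu_t(a \mid x_{<t}) - \mu(a \mid x_{<t})\bigr)^2
  \;=\; O\bigl(w_\mu^{-1}\bigr)
\]

The new bound announced in the abstract is of the first, much smaller flavor: for countable Bernoulli classes it yields, for certain important model classes, a total loss comparable to the Bayes-mixture bound rather than the exponentially larger generic MDL bound.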