We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models ${\mathcal M}$. If data are i.i.d. according to an (essentially) arbitrary distribution $P$, then the redundancy grows at rate $\frac{1}{2} c \ln n$. We show that $c = \sigma_1^2 / \sigma_2^2$, where $\sigma_1^2$ is the variance of $P$ and $\sigma_2^2$ is the variance of the distribution $M^{*} \in {\mathcal M}$ that is closest to $P$ in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the two-part MDL, Shtarkov, and Bayes codes, for which $c = 1$. This behavior is undesirable in an MDL model selection setting.
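As an illustration of the rate $\frac{1}{2} c \ln n$, here is a minimal Monte Carlo sketch (not code from the paper). It assumes the Poisson model class, for which the KL-closest element $M^*$ to a data distribution $P$ with mean $\mu$ is Poisson($\mu$), so that $\sigma_2^2 = \mu$. The two-point data distribution, the Laplace-style smoothing of the plug-in estimate, and the sequence length and trial count are all illustrative choices.

```python
import numpy as np
from math import lgamma, log

def poisson_logpmf(k, mu):
    # log P(X = k) for X ~ Poisson(mu)
    return k * log(mu) - mu - lgamma(k + 1)

def prequential_redundancy(n, rng):
    # Data: X_i = 0 or 4 with prob 1/2 each, so mean mu = 2 and
    # sigma_1^2 = Var(P) = 4. The KL-closest Poisson has mean 2,
    # hence sigma_2^2 = 2 and c = sigma_1^2 / sigma_2^2 = 2.
    xs = rng.choice([0, 4], size=n)
    redundancy, cum = 0.0, 0.0
    for i, x in enumerate(xs):
        # Plug-in ML estimate from the past, smoothed (an illustrative
        # choice) so that mu_hat > 0 even before any data are seen.
        mu_hat = (cum + 1.0) / (i + 1.0)
        # Code length of the prequential code minus that of M* = Poisson(2),
        # accumulated in nats.
        redundancy += poisson_logpmf(x, 2.0) - poisson_logpmf(x, mu_hat)
        cum += x
    return redundancy

rng = np.random.default_rng(0)
n, trials = 10_000, 100
avg = np.mean([prequential_redundancy(n, rng) for _ in range(trials)])
c = 4.0 / 2.0
print(f"empirical redundancy ~ {avg:.1f} nats;"
      f" (c/2) ln n ~ {0.5 * c * np.log(n):.1f} nats")
```

With these settings the averaged redundancy should come out near $(c/2)\ln n = \ln 10000 \approx 9.2$ nats, roughly twice the $\frac{1}{2}\ln n$ a Bayes or two-part code would incur, up to lower-order terms.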