We analyze the Dawid-Rissanen prequential maximum likelihood codes relative to one-parameter exponential family models ${\mathcal M}$. If data are i.i.d. according to an (essentially) arbitrary distribution $P$, then the redundancy grows at rate $\frac{1}{2} c \ln n$. We show that $c = \sigma_1^2 / \sigma_2^2$, where $\sigma_1^2$ is the variance of $P$ and $\sigma_2^2$ is the variance of the distribution $M^{*} \in {\mathcal M}$ that is closest to $P$ in KL divergence. This shows that prequential codes behave quite differently from other important universal codes such as the two-part MDL, Shtarkov, and Bayes codes, for which $c = 1$. This behavior is undesirable in an MDL model selection setting.
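As an illustration of the rate $\frac{1}{2} c \ln n$, here is a minimal Monte Carlo sketch (not code from the paper). It assumes the Poisson model class, for which the KL-closest element $M^*$ to a data distribution $P$ with mean $\mu$ is Poisson($\mu$), so that $\sigma_2^2 = \mu$. The two-point data distribution, the Laplace-style smoothing of the plug-in estimate, and the sequence length and trial count are all illustrative choices.

```python
import numpy as np
from math import lgamma, log

def poisson_logpmf(k, mu):
    # log P(X = k) for X ~ Poisson(mu)
    return k * log(mu) - mu - lgamma(k + 1)

def prequential_redundancy(n, rng):
    # Data: X_i = 0 or 4 with prob 1/2 each, so mean mu = 2 and
    # sigma_1^2 = Var(P) = 4. The KL-closest Poisson has mean 2,
    # hence sigma_2^2 = 2 and c = sigma_1^2 / sigma_2^2 = 2.
    xs = rng.choice([0, 4], size=n)
    redundancy, cum = 0.0, 0.0
    for i, x in enumerate(xs):
        # Plug-in ML estimate from the past, smoothed (an illustrative
        # choice) so that mu_hat > 0 even before any data are seen.
        mu_hat = (cum + 1.0) / (i + 1.0)
        # Code length of the prequential code minus that of M* = Poisson(2),
        # accumulated in nats.
        redundancy += poisson_logpmf(x, 2.0) - poisson_logpmf(x, mu_hat)
        cum += x
    return redundancy

rng = np.random.default_rng(0)
n, trials = 10_000, 100
avg = np.mean([prequential_redundancy(n, rng) for _ in range(trials)])
c = 4.0 / 2.0
print(f"empirical redundancy ~ {avg:.1f} nats;"
      f" (c/2) ln n ~ {0.5 * c * np.log(n):.1f} nats")
```

With these settings the averaged redundancy should come out near $(c/2)\ln n = \ln 10000 \approx 9.2$ nats, roughly twice the $\frac{1}{2}\ln n$ a Bayes or two-part code would incur, up to lower-order terms.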