For problems of data compression, gambling, and prediction of individual sequences $x_1, \dots, x_n$, the following questions arise. Given a target family of probability mass functions $p(x_1, \dots, x_n \mid \theta)$, how do we choose a probability mass function $q(x_1, \dots, x_n)$ so that it approximately minimizes the maximum regret
$$\max_{x_1, \dots, x_n} \left( \log \frac{1}{q(x_1, \dots, x_n)} - \log \frac{1}{p(x_1, \dots, x_n \mid \hat\theta)} \right),$$
and so that it achieves the best constant $C$ in the asymptotics of the minimax regret, which is of the form $(d/2)\log(n/2\pi) + C + o(1)$, where $d$ is the parameter dimension? Are there easily implementable strategies $q$ that achieve those asymptotics? And how does the solution of the worst-case sequence problem relate to the solution of the corresponding expectation version
$$\min_q \max_\theta E_\theta \left( \log \frac{1}{q(x_1, \dots, x_n)} - \log \frac{1}{p(x_1, \dots, x_n \mid \theta)} \right)?$$
In the discrete memoryless case, with a given alphabet of size $m$, the Bayes procedure with the Dirichlet$(1/2, \dots, 1/2)$ prior is asymptotically maximin. Simple modifications of it are shown to be asymptotically minimax. The best constant is $C_m = \log\bigl(\Gamma(1/2)^m / \Gamma(m/2)\bigr)$, which agrees with the logarithm of the integral of the square root of the determinant of the Fisher information. Moreover, our asymptotically optimal strategies for the worst-case problem are also asymptotically optimal for the expectation version. Analogous conclusions are given for the case of prediction, gambling, and compression when, for each observation, one has access to side information from an alphabet of size $k$. In this setting the minimax regret is shown to be
$$\frac{k(m-1)}{2} \log \frac{n}{2\pi k} + k C_m + o(1).$$
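To make the discrete memoryless case concrete, the sketch below (not from the paper; the function names and brute-force enumeration are illustrative assumptions) computes the exact minimax regret $\log \sum_{x^n} p\bigl(x^n \mid \hat\theta(x^n)\bigr)$ for an $m$-ary alphabet by summing the maximized likelihood over count vectors, and compares it with the asymptotic formula $\frac{m-1}{2}\log\frac{n}{2\pi} + C_m$, with all logarithms in nats.

```python
# A minimal sketch, not from the paper: it compares the exact minimax regret
# for an m-ary memoryless source with the asymptotic formula
# (m-1)/2 * log(n/(2*pi)) + C_m, all in nats. The function names and the
# brute-force enumeration are illustrative choices, not the authors' method.
import math

def count_vectors(n, m):
    """Yield every m-tuple of nonnegative counts summing to n."""
    if m == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in count_vectors(n - first, m - 1):
            yield (first,) + rest

def exact_minimax_regret(n, m):
    """log of sum_{x^n} p(x^n | mle(x^n)), grouped by type class:
    sum over count vectors of  n!/(n_1!...n_m!) * prod_i (n_i/n)^{n_i}."""
    total = 0.0
    for counts in count_vectors(n, m):
        log_term = math.lgamma(n + 1)          # log n!
        for c in counts:
            log_term -= math.lgamma(c + 1)     # minus log n_i!
            if c > 0:
                log_term += c * math.log(c / n)  # maximized likelihood factor
        total += math.exp(log_term)
    return math.log(total)

def asymptotic_minimax_regret(n, m):
    """(d/2) log(n/(2*pi)) + C_m with d = m - 1 and
    C_m = log(Gamma(1/2)^m / Gamma(m/2))."""
    c_m = m * math.lgamma(0.5) - math.lgamma(m / 2)
    return 0.5 * (m - 1) * math.log(n / (2 * math.pi)) + c_m

if __name__ == "__main__":
    m = 3
    for n in (10, 100, 1000):
        print(f"n={n:5d}  exact={exact_minimax_regret(n, m):.4f}  "
              f"asymptotic={asymptotic_minimax_regret(n, m):.4f}")
```

Under these assumptions, the gap between the two columns should shrink as $n$ grows, illustrating the $o(1)$ term in the expansion quoted above.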