Maximum entropy as a special case of the minimum description length criterion. IEEE Transactions on Information Theory.
Elements of information theory.
An introduction to Kolmogorov complexity and its applications (2nd ed.).
Additive models, boosting, and inference for generalized divergences. Proceedings of the Twelfth Annual Conference on Computational Learning Theory (COLT '99).
Stochastic Complexity in Statistical Inquiry.
Maximum Entropy and the Glasses You are Looking Through. Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI '00).
Relative loss bounds for on-line density estimation with the exponential family of distributions. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI '99).
The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory.
Strong optimality of the normalized ML models as universal codes and information in data. IEEE Transactions on Information Theory.
A strong version of the redundancy-capacity theorem of universal coding. IEEE Transactions on Information Theory.
We give a characterization of Maximum Entropy/Minimum Relative Entropy inference by providing two 'strong entropy concentration' theorems. These theorems unify and generalize Jaynes' 'concentration phenomenon' and Van Campenhout and Cover's 'conditional limit theorem'. The theorems characterize exactly in what sense a 'prior' distribution Q conditioned on a given constraint and the distribution P minimizing D(P||Q) over all P satisfying the constraint are 'close' to each other. We show how our theorems are related to 'universal models' for exponential families, thereby establishing a link with Rissanen's MDL/stochastic complexity. We then apply our theorems to establish the relationship (A) between entropy concentration and a game-theoretic characterization of Maximum Entropy inference due to Topsøe and others; (B) between maximum entropy distributions and sequences that are random (in the sense of Martin-Löf/Kolmogorov) with respect to the given constraint. These two applications have strong implications for the use of Maximum Entropy distributions in sequential prediction tasks, both for the logarithmic loss and for general loss functions. We identify circumstances under which Maximum Entropy predictions are almost optimal.
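As a concrete illustration of the minimization the abstract describes (not taken from the paper itself), the following Python sketch computes the distribution P minimizing D(P||Q) under a mean constraint, using the standard fact that the minimizer lies in an exponential family, P(x) ∝ Q(x)·exp(λx). The setup is Jaynes' dice example; the outcome set, the uniform prior Q, and the constraint value t = 4.5 are illustrative choices, not parameters from the paper.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative sketch: minimize D(P||Q) over all P on {1,...,6} with
# E_P[X] = t. The minimizer has the exponential-family ('tilted') form
# P(x) = Q(x) * exp(lambda * x) / Z, with lambda set to meet the constraint.

xs = np.arange(1, 7)      # outcomes of a die (illustrative choice)
Q = np.full(6, 1 / 6)     # 'prior' Q: uniform (illustrative choice)
t = 4.5                   # constraint E_P[X] = 4.5 (Jaynes' dice example)

def mean_under(lam):
    w = Q * np.exp(lam * xs)   # unnormalized exponential tilt of Q
    p = w / w.sum()
    return p @ xs

# Solve for the Lagrange multiplier satisfying the mean constraint;
# mean_under is increasing in lam, so a bracketing root-finder suffices.
lam = brentq(lambda l: mean_under(l) - t, -10.0, 10.0)
P = Q * np.exp(lam * xs)
P /= P.sum()

print("MaxEnt P:", np.round(P, 4))                      # tilted toward larger faces
print("E_P[X] =", round(float(P @ xs), 4))              # ~ 4.5
print("D(P||Q) =", round(float(P @ np.log(P / Q)), 4))  # relative entropy attained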