Various optimality properties of universal sequence predictors based on Bayes-mixtures in general, and Solomonoff's prediction scheme in particular, are studied. The probability of observing x_t at time t, given past observations x_1...x_{t-1}, can be computed with the chain rule if the true generating distribution μ of the sequences x_1 x_2 x_3 ... is known. If μ is unknown but known to belong to a countable or continuous class M, one can base one's prediction on the Bayes-mixture ξ, defined as a w_ν-weighted sum or integral of the distributions ν ∈ M. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on ξ is shown to be close to the loss of the Bayes-optimal, but infeasible, prediction scheme based on μ. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of ξ and give an Occam's razor argument that the choice w_ν ∼ 2^(−K(ν)) for the weights is optimal, where K(ν) is the length of the shortest program describing ν. The results are applied to games of chance, defined as sequences of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets, partial, delayed, and probabilistic prediction, classification, and more active systems are briefly discussed.
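To make the mixture idea concrete, here is a minimal sketch (not from the paper) of Bayes-mixture prediction over a toy finite class M of Bernoulli environments. The function name, the three-element class, and the uniform prior weights are all illustrative assumptions; the paper's setting allows countable or continuous classes and general loss functions.

```python
# Toy sketch: Bayes-mixture prediction over a finite class of
# Bernoulli environments (a stand-in for the class M of the abstract).
# All names and parameter choices here are illustrative assumptions.

def bayes_mixture_predict(history, envs, weights):
    """Mixture probability xi(x_t = 1 | x_1...x_{t-1}).

    envs:    list of Bernoulli parameters theta, i.e. nu(x = 1) = theta.
    weights: prior weights w_nu (need not be normalized).
    """
    post = []
    for theta, w in zip(envs, weights):
        # nu(x_1...x_{t-1}) via the chain rule: product of per-step probs.
        likelihood = 1.0
        for x in history:
            likelihood *= theta if x == 1 else 1.0 - theta
        post.append(w * likelihood)  # unnormalized posterior weight of nu
    total = sum(post)
    # xi(1 | x_<t) = sum_nu posterior(nu) * nu(1)
    return sum(p * theta for p, theta in zip(post, envs)) / total

# Usage: data mostly generated by theta = 0.9 concentrates the
# posterior weight there, so the mixture predicts 1 with high probability.
envs = [0.1, 0.5, 0.9]
weights = [1 / 3, 1 / 3, 1 / 3]
history = [1, 1, 1, 0, 1, 1, 1, 1]
p1 = bayes_mixture_predict(history, envs, weights)
```

As the history grows, the posterior weight of the true environment dominates and the mixture prediction converges to it, which is the mechanism behind the loss bounds discussed above.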