Minimax regret under log loss for general classes of experts
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
On prediction of individual sequences relative to a set of experts in the presence of noise
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Tracking a Small Set of Experts by Mixing Past Posteriors
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Potential-Based Algorithms in Online Prediction and Game Theory
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Discrete Prediction Games with Arbitrary Feedback and Loss
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Learning Additive Models Online with Fast Evaluating Kernels
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
Mixability and the Existence of Weak Complexities
COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Tracking the best linear predictor
The Journal of Machine Learning Research
Tracking a small set of experts by mixing past posteriors
The Journal of Machine Learning Research
Optimality of universal Bayesian sequence prediction for general loss and alphabet
The Journal of Machine Learning Research
Superior Guarantees for Sequential Prediction and Lossless Compression via Alphabet Decomposition
The Journal of Machine Learning Research
Prediction with expert advice for the Brier game
The Journal of Machine Learning Research
Prediction with expert evaluators' advice
ALT'09 Proceedings of the 20th international conference on Algorithmic learning theory
Prediction with expert advice under discounted loss
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Relative loss bounds for on-line density estimation with the exponential family of distributions
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
A randomized online learning algorithm for better variance control
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Continuous experts and the binning algorithm
COLT'06 Proceedings of the 19th annual conference on Learning Theory
On-line regression competitive with reproducing kernel Hilbert spaces
TAMC'06 Proceedings of the Third international conference on Theory and Applications of Models of Computation
The weak aggregating algorithm and weak mixability
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Sparse regression learning by aggregation and Langevin Monte-Carlo
Journal of Computer and System Sciences
Mixability is Bayes risk curvature relative to log loss
The Journal of Machine Learning Research
We consider adaptive sequential prediction of arbitrary binary sequences when performance is evaluated using a general loss function. The goal is to predict on each individual sequence nearly as well as the best prediction strategy in a given comparison class of (possibly adaptive) prediction strategies, called experts. By using a general loss function, we generalize previous work on universal prediction, forecasting, and data compression. However, here we restrict ourselves to the case when the comparison class is finite. For a given sequence, we define the regret as the total loss on the entire sequence suffered by the adaptive sequential predictor, minus the total loss suffered by the predictor in the comparison class that performs best on that particular sequence. We show that for a large class of loss functions, the minimax regret is either Θ(log N) or Ω(√(L log N)), depending on the loss function, where N is the number of predictors in the comparison class and L is the length of the sequence to be predicted. The former case was shown previously by Vovk (1990); we give a simplified analysis with an explicit closed form for the constant in the minimax regret formula, and give a probabilistic argument that shows this constant is the best possible. Some weak regularity conditions are imposed on the loss function in obtaining these results. We also extend our analysis to the case of predicting arbitrary sequences that take real values in the interval [0,1].
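To make the √(L log N) regime concrete: that rate is matched, up to a constant, by the exponentially weighted average forecaster. The following is a minimal Python sketch, not taken from the paper; the loss matrix, the tuned learning rate, and the convention that the forecaster suffers the weight-averaged expert loss are illustrative assumptions.

```python
import numpy as np

def exp_weights_loss(expert_losses, eta):
    """Exponentially weighted average forecaster (a minimal sketch).

    expert_losses: (L, N) array; entry (t, i) is the loss of expert i
                   at round t, assumed to lie in [0, 1].
    eta:           learning rate > 0.

    Returns the forecaster's cumulative loss under the convention that,
    at each round, it suffers the weight-averaged loss of the experts
    (valid via Jensen's inequality when the loss is convex in the
    prediction).
    """
    L, N = expert_losses.shape
    log_w = np.zeros(N)              # log-weights, start uniform
    total = 0.0
    for t in range(L):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                 # current mixture over experts
        total += p @ expert_losses[t]    # forecaster's loss this round
        log_w -= eta * expert_losses[t]  # multiplicative weight update
    return total

# Illustration: regret against the best of N experts over L rounds.
rng = np.random.default_rng(0)
L, N = 1000, 16
losses = rng.uniform(size=(L, N))
eta = np.sqrt(8 * np.log(N) / L)     # tuned rate for [0,1]-valued losses
alg = exp_weights_loss(losses, eta)
best = losses.sum(axis=0).min()
print(f"regret = {alg - best:.2f}, bound = {np.sqrt(L * np.log(N) / 2):.2f}")
```

For [0,1]-valued losses, the tuned rate η = √(8 ln N / L) gives the standard guarantee that the regret is at most √((L/2) ln N), matching the Ω(√(L log N)) lower bound up to a constant; for mixable losses such as log loss, running the same scheme with a fixed learning rate (Vovk's aggregating algorithm) yields the Θ(log N) regime instead.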