In this paper, we examine on-line learning problems in which the target concept is allowed to change over time. In each trial, a master algorithm receives predictions from a large set of n experts. Its goal is to predict almost as well as the best sequence of such experts chosen off-line by partitioning the training sequence into k+1 sections and then choosing the best expert for each section. We build on methods developed by Herbster and Warmuth and consider an open problem posed by Freund where the experts in the best partition are from a small pool of size m. Since k ≫ m, the best expert shifts back and forth between the experts of the small pool. We propose algorithms that solve this open problem by mixing the past posteriors maintained by the master algorithm. We relate the number of bits needed for encoding the best partition to the loss bounds of the algorithms. Instead of paying log n for choosing the best expert in each section, we first pay log (n choose m) bits in the bounds for identifying the pool of m experts and then log m bits per new section. In the bounds we also pay twice for encoding the boundaries of the sections.
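The idea of mixing past posteriors can be illustrated with a minimal sketch. The abstract does not spell out the update rules, so everything specific here is an assumption made for illustration: the exponential-weights loss update, the "uniform past" mixing scheme (weight 1 - alpha on the newest posterior, alpha spread evenly over all earlier ones), the function name mix_past_posteriors, and the parameters eta and alpha.

```python
import math


def mix_past_posteriors(expert_losses, eta=1.0, alpha=0.05):
    """Illustrative sketch of a master algorithm that mixes past posteriors.

    expert_losses: list of trials; each trial is a list with one loss in
                   [0, 1] per expert.
    eta:   learning rate of the loss update (assumed, not from the abstract).
    alpha: total mixing weight given to earlier posteriors (assumed).
    Returns the weight vector the master uses at the start of each trial.
    """
    n = len(expert_losses[0])
    weights = [1.0 / n] * n      # uniform prior over the n experts
    past_sum = [0.0] * n         # running sum of all earlier posteriors
    past_count = 0               # how many posteriors past_sum contains
    latest = list(weights)       # the prior counts as posterior number 0
    used = []

    for losses in expert_losses:
        used.append(weights)

        # The previous "latest" posterior now becomes part of the past.
        past_sum = [s + p for s, p in zip(past_sum, latest)]
        past_count += 1

        # Loss update: exponentially down-weight experts by their loss.
        unnorm = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        z = sum(unnorm)
        latest = [u / z for u in unnorm]

        # Mixing update: mostly the newest posterior, plus a small share
        # of the average of all earlier posteriors. Experts that were good
        # in some earlier section keep a floor of weight and recover fast.
        weights = [
            (1.0 - alpha) * latest[i] + alpha * past_sum[i] / past_count
            for i in range(n)
        ]
    return used


# Usage: 4 experts; the best expert shifts 0 -> 1 -> 0 within a pool of 2.
schedule = [0] * 20 + [1] * 20 + [0] * 20
trials = [[0.0 if i == good else 1.0 for i in range(4)] for good in schedule]
history = mix_past_posteriors(trials)
```

In this toy run the pool has m = 2 experts out of n = 4. Because earlier posteriors concentrated on expert 0, the mixing step keeps some weight on it during the middle section, which is what lets the master switch back quickly in the final section.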