This paper considers the sequential prediction problem with expert advice when the losses are unbounded and feedback is available only under partial monitoring. We treat a wide class of partial monitoring problems: the combination of the label efficient and multi-armed bandit problems, in which the algorithm is informed about the performance of the chosen expert only with probability ε ≤ 1. For bounded losses, we give an algorithm whose expected regret scales with the square root of the loss of the best expert. For unbounded losses, we prove that Hannan consistency can be achieved, depending on the growth rate of the average squared losses of the experts.
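The feedback model in the abstract can be illustrated with a minimal sketch. It is not the algorithm of the paper: it assumes a generic exponentially weighted forecaster with a fixed learning rate `eta` and the standard importance-weighted loss estimate, in which the observed loss of the chosen expert is divided by the probability of choosing it times ε. The names `estimated_loss`, `run_forecaster`, `eta`, and `rng` are hypothetical and introduced only for illustration.

```python
import math
import random


def estimated_loss(loss, prob_chosen, observed, eps):
    """Importance-weighted estimate of the chosen expert's loss.

    The loss is revealed only with probability eps, so dividing by
    prob_chosen * eps makes the estimate unbiased for every expert.
    """
    if observed:
        return loss / (prob_chosen * eps)
    return 0.0


def run_forecaster(losses, eps, eta, rng):
    """Run an exponentially weighted forecaster (illustrative assumption).

    losses[t][i] is the loss of expert i in round t. Returns the total
    loss suffered and the cumulative estimated losses of the experts.
    """
    n_experts = len(losses[0])
    cum_est = [0.0] * n_experts
    total_loss = 0.0
    for round_losses in losses:
        # Shift by the minimum so the exponentials cannot overflow.
        base = min(cum_est)
        weights = [math.exp(-eta * (c - base)) for c in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        # Bandit part: follow one randomly chosen expert.
        chosen = rng.choices(range(n_experts), weights=probs)[0]
        total_loss += round_losses[chosen]
        # Label efficient part: the loss is revealed with probability eps.
        observed = rng.random() < eps
        cum_est[chosen] += estimated_loss(
            round_losses[chosen], probs[chosen], observed, eps
        )
    return total_loss, cum_est
```

Setting `eps = 1.0` recovers the plain multi-armed bandit feedback model, since the loss of the chosen expert is then revealed in every round. The estimate grows as ε or the selection probability shrinks, which is one reason unbounded losses need separate treatment.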