Hannan consistency in on-line learning in case of unbounded losses under partial monitoring

Authors:
Chamy Allenberg;Peter Auer;László Györfi;György Ottucsák
Affiliations:
School of Computer Science, Tel Aviv University, Tel Aviv, Israel;Chair for Information Technology, University of Leoben, Leoben, Austria;Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Budapest, Hungary;Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Budapest, Hungary
Venue:
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Year:
2006

Abstract

This paper considers sequential prediction with expert advice when the losses are unbounded and only partial monitoring is available. We treat a wide class of partial monitoring problems, namely the combination of the label efficient and multi-armed bandit problems, in which the algorithm is informed about the performance of the chosen expert only with probability ε≤1. For bounded losses, we give an algorithm whose expected regret scales with the square root of the loss of the best expert. For unbounded losses, we prove that Hannan consistency can be achieved, depending on the growth rate of the average squared losses of the experts.
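
The feedback model described in the abstract can be illustrated with a short simulation. The sketch below is not the algorithm from the paper: it is a generic exponentially weighted forecaster run on importance-weighted loss estimates, and the function name, the fixed learning rate eta, and the loss data are all assumptions made for illustration. In each round the forecaster follows one expert, and with probability eps it observes that expert's loss, which it divides by the probability of both choosing the expert and receiving feedback so that the estimate is unbiased.

```python
import math
import random


def run_label_efficient_bandit(losses, eps, eta, seed=0):
    """Exponential weights on importance-weighted loss estimates.

    losses: list of rounds, each a list of per-expert losses (hypothetical data).
    eps:    probability that the chosen expert's loss is revealed.
    eta:    learning rate (a fixed, illustrative choice, not from the paper).
    Returns (cumulative loss of the forecaster, cumulative losses of the experts).
    """
    rng = random.Random(seed)
    n_experts = len(losses[0])
    est = [0.0] * n_experts        # cumulative estimated losses
    true_cum = [0.0] * n_experts   # cumulative true losses (for evaluation only)
    total = 0.0
    for round_losses in losses:
        # Exponential weights; subtract the minimum for numerical stability.
        m = min(est)
        w = [math.exp(-eta * (e - m)) for e in est]
        s = sum(w)
        p = [x / s for x in w]
        chosen = rng.choices(range(n_experts), weights=p)[0]
        total += round_losses[chosen]
        for i, loss in enumerate(round_losses):
            true_cum[i] += loss
        # Feedback arrives only with probability eps, and only for the
        # chosen expert; dividing by p[chosen] * eps keeps the estimate
        # unbiased for the true loss.
        if rng.random() < eps:
            est[chosen] += round_losses[chosen] / (p[chosen] * eps)
    return total, true_cum
```

Calling the function with eps = 1 corresponds to the plain multi-armed bandit setting, where the chosen expert's loss is always observed. Hannan consistency asks that the forecaster's average loss exceed the average loss of the best expert by an amount that vanishes as the number of rounds grows; the per-round regret can be inspected as (total - min(true_cum)) / len(losses).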