Toward a classification of finite partial-monitoring games
Theoretical Computer Science
In a finite partial-monitoring game against Nature, the Learner repeatedly chooses one of finitely many actions and Nature responds with one of finitely many outcomes. The Learner then suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the Learner is to minimize its cumulative loss. We make progress toward a classification of these games based on their minimax expected regret. Namely, we classify almost all games with two outcomes: we show that their minimax expected regret is either zero, Θ(√T), Θ(T^{2/3}), or Θ(T), and we give a simple and efficiently computable characterization of these four classes of games. Our hope is that the result can serve as a stepping stone toward classifying all finite partial-monitoring games.
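The interaction protocol described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's construction: the names `play_game`, `loss`, `feedback`, and `always_first` are hypothetical, the example matrices are made up, and regret is taken in the usual sense of the Learner's cumulative loss minus that of the best fixed action in hindsight.

```python
def play_game(loss, feedback, learner, outcomes):
    """Simulate a finite partial-monitoring game (illustrative sketch).

    loss[i][j]     -- loss of action i under outcome j (hidden from the Learner)
    feedback[i][j] -- signal the Learner observes for action i and outcome j
    learner        -- function mapping the history of (action, signal) pairs
                      to the index of the next action
    outcomes       -- sequence of outcome indices chosen by Nature
    Returns (cumulative loss, regret against the best fixed action).
    """
    history = []
    total = 0
    for j in outcomes:
        i = learner(history)                 # Learner picks an action
        total += loss[i][j]                  # loss is suffered but not revealed
        history.append((i, feedback[i][j]))  # only the signal is observed
    # Cumulative loss of the best single action in hindsight.
    best_fixed = min(sum(row[j] for j in outcomes) for row in loss)
    return total, total - best_fixed


# Hypothetical game with two actions and two outcomes.
loss = [[0, 1],
        [1, 0]]
feedback = [["a", "b"],
            ["a", "b"]]


def always_first(history):
    """A trivial Learner that ignores feedback and always plays action 0."""
    return 0


total, regret = play_game(loss, feedback, always_first, [0, 1, 1, 1])
print(total, regret)
```

The minimax expected regret studied in the paper is the worst case of this quantity over Nature's outcome sequences, for the best (possibly randomized) Learner, as a function of the number of rounds T.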