Regret Minimization Under Partial Monitoring

Authors:
Nicolò Cesa-Bianchi;Gábor Lugosi;Gilles Stoltz
Affiliations:
Dipartimento di Scienze dell'Informazione, Università di Milano, Milano, Italy;ICREA and Department of Economics, Pompeu Fabra University, Barcelona, Spain;CNRS and Département de Mathématiques et Applications, Ecole Normale Supérieure, Paris, France
Venue:
Mathematics of Operations Research
Year:
2006

Abstract

We consider repeated games in which the player, instead of observing the action chosen by the opponent in each game round, receives a feedback generated by the combined choice of the two players. We study Hannan-consistent players for these games, that is, randomized playing strategies whose per-round regret vanishes with probability one as the number n of game rounds goes to infinity. We prove a general lower bound of Ω(n^{-1/3}) for the convergence rate of the regret, and exhibit a specific strategy that attains this rate for any game for which a Hannan-consistent player exists.
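
The abstract's two main notions, per-round regret and Hannan consistency, can be written out as a short sketch. The notation is an assumption made for illustration and does not come from the abstract: I_t and J_t stand for the actions of the player and the opponent in round t, ℓ for a loss function, N for the number of player actions, and R_n for the cumulative regret after n rounds. The consistency condition is given in the usual limsup form.

```latex
% Assumed notation (illustrative, not taken from the abstract):
%   I_t = player's action in round t, J_t = opponent's action in round t,
%   \ell = loss function, N = number of player actions,
%   R_n = cumulative regret after n rounds.
\[
  \frac{R_n}{n}
  \;=\;
  \frac{1}{n}\left(
    \sum_{t=1}^{n} \ell(I_t, J_t)
    \;-\;
    \min_{i=1,\dots,N} \sum_{t=1}^{n} \ell(i, J_t)
  \right),
  \qquad
  \limsup_{n\to\infty} \frac{R_n}{n} \le 0
  \quad \text{with probability one.}
\]
% Rescaling the rate: a per-round regret of order n^{-1/3}
% is a cumulative regret of order n \cdot n^{-1/3} = n^{2/3}.
\[
  \frac{R_n}{n} = \Omega\!\left(n^{-1/3}\right)
  \quad\Longleftrightarrow\quad
  R_n = \Omega\!\left(n^{2/3}\right).
\]
```

Under this reading, the rate in the abstract says that cumulative regret can grow no more slowly than order n^{2/3} in the worst case, and that the exhibited strategy matches that order whenever a Hannan-consistent player exists.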