External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares the loss of an online algorithm to the loss of a modified online algorithm that consistently replaces one action by another. In this paper, we give a simple generic reduction that, given an algorithm for the external regret problem, converts it to an efficient online algorithm for the internal regret problem. We provide methods that work both in the full information model, in which the loss of every action is observed at each time step, and in the partial information (bandit) model, where at each time step only the loss of the selected action is observed.

The importance of internal regret in game theory is due to the fact that in a general game, if each player has sublinear internal regret, then the empirical frequencies of play converge to a correlated equilibrium.

For external regret we also derive a quantitative regret bound for a very general setting, which allows an arbitrary set of modification rules (that possibly modify the online algorithm) and an arbitrary set of time selection functions (each giving a different weight to each time step). The regret for a given time selection function and modification rule is the difference between the cost of the online algorithm and the cost of the modified online algorithm, where the costs are weighted by the time selection function. This can be viewed as a generalization of the previously studied sleeping experts setting.
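The external-to-internal reduction described above can be sketched in code. The following is a minimal illustration, not the paper's exact construction: it assumes Hedge (multiplicative weights) as the external-regret subroutine, runs one copy per action, feeds copy i the loss vector scaled by the probability p_i placed on action i, and plays the fixed point p = pQ of the matrix Q whose rows are the copies' distributions. All class and parameter names are illustrative.

```python
import numpy as np

class Hedge:
    """External-regret subroutine: multiplicative weights over n actions."""
    def __init__(self, n_actions, eta=0.5):
        self.weights = np.ones(n_actions)
        self.eta = eta

    def probs(self):
        return self.weights / self.weights.sum()

    def update(self, losses):
        # Exponentially down-weight each action by its observed loss.
        self.weights *= np.exp(-self.eta * np.asarray(losses, dtype=float))


class InternalFromExternal:
    """Sketch of an external-to-internal reduction: one external-regret
    copy per action; copy i sees the true losses scaled by p_i."""
    def __init__(self, n_actions, eta=0.5):
        self.n = n_actions
        self.copies = [Hedge(n_actions, eta) for _ in range(n_actions)]
        self.p = np.ones(n_actions) / n_actions

    def probs(self):
        # Row i of Q is the distribution recommended by copy i.
        Q = np.vstack([c.probs() for c in self.copies])
        # Power iteration for the fixed point p = p Q (Q has strictly
        # positive entries here, so the iteration converges).
        p = np.ones(self.n) / self.n
        for _ in range(200):
            p = p @ Q
            p /= p.sum()
        self.p = p
        return p

    def update(self, losses):
        losses = np.asarray(losses, dtype=float)
        for i, copy in enumerate(self.copies):
            # Copy i is charged only for the probability mass p_i it controls.
            copy.update(self.p[i] * losses)
```

A typical round is: call `probs()` to get the distribution to play, observe the loss vector, then call `update(losses)`. Intuitively, copy i accounts for the rounds attributed to action i, so its external regret bounds the regret of the modification rule "replace action i by action j".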