Elements of Information Theory.
The Continuum-Armed Bandit Problem. SIAM Journal on Control and Optimization.
Journal of the ACM (JACM).
A Game of Prediction with Expert Advice. Journal of Computer and System Sciences (special issue on the Eighth Annual Workshop on Computational Learning Theory, July 5–8, 1995).
The Nonstochastic Multiarmed Bandit Problem. SIAM Journal on Computing.
Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning.
Using Confidence Bounds for Exploitation-Exploration Trade-offs. Journal of Machine Learning Research.
Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient. Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '05).
Robbing the Bandit: Less Regret in Online Geometric Optimization Against an Adaptive Adversary. Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '06).
Prediction, Learning, and Games.
Online Decision Problems with Large Strategy Sets.
Playing Games with Approximation Algorithms. Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing (STOC '07).
Online Linear Optimization and Adaptive Routing. Journal of Computer and System Sciences.
Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards. Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS '07).
Multi-armed Bandits in Metric Spaces. Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing (STOC '08).
Approximation Algorithms for Restless Bandit Problems. Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '09).
Better Algorithms for Benign Bandits. Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '09).
Improved Rates for the Stochastic Continuum-Armed Bandit Problem. Proceedings of the 20th Annual Conference on Learning Theory (COLT '07).
Online Learning with Prior Knowledge. Proceedings of the 20th Annual Conference on Learning Theory (COLT '07).
Pure Exploration in Finitely-Armed and Continuous-Armed Bandits. Theoretical Computer Science.
Convergence Rates of Efficient Global Optimization Algorithms. Journal of Machine Learning Research.
Ranked Bandits in Metric Spaces: Learning Diverse Rankings over Large Document Collections. Journal of Machine Learning Research.
The Lipschitz multi-armed bandit (MAB) problem generalizes the classical multi-armed bandit problem by assuming one is given side information consisting of a priori upper bounds on the difference in expected payoff between certain pairs of strategies. Classical results of Lai-Robbins [28] and Auer et al. [3] imply a logarithmic regret bound for the Lipschitz MAB problem on finite metric spaces. Recent results on continuum-armed bandit problems and their generalizations imply lower bounds of √t, or stronger, for many infinite metric spaces such as the unit interval. Is this dichotomy universal? We prove that the answer is yes: for every metric space, the optimal regret of a Lipschitz MAB algorithm is either bounded above by any f ∈ ω(log t), or bounded below by any g ∈ o(√t). Perhaps surprisingly, this dichotomy does not coincide with the distinction between finite and infinite metric spaces; instead, it depends on whether the completion of the metric space is compact and countable. Our proof connects upper and lower bound techniques in online learning with classical topological notions such as perfect sets and the Cantor-Bendixson theorem. We also consider the full-feedback (a.k.a. best-expert) version of the Lipschitz MAB problem, termed the Lipschitz experts problem, and show that this problem exhibits a similar dichotomy. We proceed to give nearly matching upper and lower bounds on regret in the Lipschitz experts problem on uncountable metric spaces. These bounds are of the form Θ(t^γ), where the exponent γ ∈ [1/2, 1] depends on the metric space. To characterize this dependence, we introduce a novel dimensionality notion tailored to the experts problem. Finally, we show that both the Lipschitz bandits and Lipschitz experts problems become completely intractable (in the sense that no algorithm has regret o(t)) if and only if the completion of the metric space is non-compact.
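To make the setting concrete, the following is a minimal sketch (not an algorithm from the paper) of the standard uniform-discretization baseline for a Lipschitz MAB instance on the unit interval: fix a finite set of arms, run UCB1 over them, and let the Lipschitz condition bound the payoff lost to discretization. The mean-payoff function mu, the number of arms K, and the horizon T below are illustrative assumptions.

```python
# Illustrative sketch only: UCB1 over a fixed uniform discretization of [0, 1].
# mu, K, and T are made up for this example; they are not taken from the paper.
import math
import random

def mu(x):
    # Hypothetical 1-Lipschitz mean payoff on [0, 1], peaked at x = 0.3.
    return 0.9 - 0.8 * abs(x - 0.3)

def lipschitz_ucb1(T=10000, K=32, seed=0):
    rng = random.Random(seed)
    arms = [(i + 0.5) / K for i in range(K)]   # centers of K uniform subintervals
    counts = [0] * K
    sums = [0.0] * K
    total_reward = 0.0
    for t in range(1, T + 1):
        if t <= K:
            i = t - 1                           # play each arm once to initialize
        else:
            # UCB1 index: empirical mean plus confidence radius.
            i = max(range(K), key=lambda j: sums[j] / counts[j]
                    + math.sqrt(2 * math.log(t) / counts[j]))
        # Bernoulli reward with mean mu(arms[i]).
        r = 1.0 if rng.random() < mu(arms[i]) else 0.0
        counts[i] += 1
        sums[i] += r
        total_reward += r
    best = max(mu(x) for x in arms)             # best arm within the discretization
    return best * T - total_reward              # regret against the discretized optimum

if __name__ == "__main__":
    print("regret vs. discretized optimum:", round(lipschitz_ucb1(), 1))
```

On the interval, tuning K to roughly (T / log T)^(1/3) balances discretization error against per-arm exploration cost and is known to give regret of order t^(2/3) up to logarithmic factors; the dichotomy stated in the abstract concerns the sharper question of which metric spaces admit regret as low as any f ∈ ω(log t) at all.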