Online linear optimization and adaptive routing

Authors:
Baruch Awerbuch;Robert Kleinberg
Affiliations:
Department of Computer Science, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218, USA;Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Venue:
Journal of Computer and System Sciences
Year:
2008

Abstract

This paper studies an online linear optimization problem that generalizes the multi-armed bandit problem. Motivated primarily by the task of designing adaptive routing algorithms for overlay networks, we present two randomized online algorithms for selecting a sequence of routing paths in a network whose unknown edge delays vary adversarially over time. In contrast with earlier work on this problem, we assume that the only feedback after choosing a path is its total end-to-end delay. Both algorithms achieve regret that is sublinear in the number of trials and polynomial in the size of the network. The first algorithm generalizes to any online linear optimization problem, given an oracle for optimizing linear functions over the set of strategies; our work may thus be interpreted as a general-purpose reduction from offline to online linear optimization. A key element of this algorithm is the notion of a barycentric spanner, a special type of basis for the vector space of strategies that allows any feasible strategy to be expressed as a linear combination of basis vectors with bounded coefficients. The second algorithm, for the online shortest path problem, uses a chain of online decision oracles, one at each node of the graph. This approach has two advantages over the online linear optimization approach. First, it is effective against an adaptive adversary, whereas our linear optimization algorithm assumes an oblivious adversary. Second, even in the case of an oblivious adversary, the second algorithm performs slightly better than the first, as measured by their additive regret.
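
To make the barycentric spanner notion concrete, the following is a minimal sketch, not the construction used in the paper. It assumes the strategy set is a small finite list of vectors spanning R^d and enumerates it by brute force, whereas the abstract assumes only a linear optimization oracle. The function names (`det`, `barycentric_spanner`, `coefficients`) and the approximation factor `C` are illustrative assumptions. The sketch uses a determinant-swapping heuristic: it keeps replacing a basis vector whenever doing so grows the absolute determinant by more than a factor of C, so that on termination every strategy has coefficients of magnitude at most C in the returned basis.

```python
def det(rows):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in r] for r in rows]
    n, d = len(m), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def swapped(basis, i, x):
    """Copy of `basis` with row i replaced by the vector x."""
    return basis[:i] + [list(x)] + basis[i + 1:]


def barycentric_spanner(strategies, C=2.0):
    """Pick d strategies such that every strategy is a combination of them
    with coefficients in [-C, C].

    Assumptions (not from the source): `strategies` is a finite list of
    d-dimensional vectors spanning R^d, and C > 1 so the loop terminates.
    """
    d = len(strategies[0])
    # Start from the standard basis, then swap in real strategies one slot at a time.
    basis = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for i in range(d):
        basis[i] = list(max(strategies, key=lambda x: abs(det(swapped(basis, i, x)))))
    # Keep swapping while some replacement grows |det| by more than a factor C.
    improved = True
    while improved:
        improved = False
        current = abs(det(basis))
        for i in range(d):
            x = max(strategies, key=lambda s: abs(det(swapped(basis, i, s))))
            if abs(det(swapped(basis, i, x))) > C * current:
                basis[i] = list(x)
                improved = True
                break
    return basis


def coefficients(basis, x):
    """Coefficients of x in `basis`, computed with Cramer's rule."""
    base = det(basis)
    return [det(swapped(basis, i, x)) / base for i in range(len(basis))]
```

In the routing setting, each strategy would be the edge-incidence vector of a path; bounded coefficients mean that delay estimates for the spanner paths extend to every other path without the estimation error growing by more than a factor of about C per basis element.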