Efficient algorithms for online decision problems

Authors:
Adam Kalai; Santosh Vempala
Affiliations:
Department of Computer Science, Toyota Technological Institute, 1427 E. 60th St., Chicago, IL 60637, USA; Massachusetts Institute of Technology, MA, USA
Venue:
Journal of Computer and System Sciences - Special issue: Learning theory 2003
Year:
2005

Abstract

In an online decision problem, one makes a sequence of decisions without knowledge of the future. Each period, one pays a cost based on the decision and observed state. We give a simple approach for doing nearly as well as the best single decision, where the best is chosen with the benefit of hindsight. A natural idea is to follow the leader, i.e. each period choose the decision which has done best so far. We show that by slightly perturbing the totals and then choosing the best decision, the expected performance is nearly as good as the best decision in hindsight. Our approach, which is very much like Hannan's original game-theoretic approach from the 1950s, yields guarantees competitive with the more modern exponential weighting algorithms like Weighted Majority. More importantly, these follow-the-leader style algorithms extend naturally to a large class of structured online problems for which the exponential algorithms are inefficient.
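
The idea of perturbing the totals before following the leader can be illustrated with a minimal sketch. It assumes a finite set of decisions with per-period costs in [0, 1], and it assumes the perturbation is drawn fresh each period, uniformly from [0, 1/epsilon]; the abstract does not fix these details, and the function name, the `epsilon` parameter, and the `seed` argument are hypothetical names chosen for illustration.

```python
import random


def follow_the_perturbed_leader(costs, epsilon, seed=0):
    """Illustrative sketch of a perturbed follow-the-leader rule.

    costs:   list of periods; costs[t][i] is the cost of decision i in period t
             (assumed to lie in [0, 1]).
    epsilon: assumed tuning parameter; perturbations are uniform on [0, 1/epsilon].
    Returns the list of chosen decisions and the total cost incurred.
    """
    rng = random.Random(seed)
    n = len(costs[0])
    totals = [0.0] * n          # cumulative cost of each decision so far
    choices = []
    incurred = 0.0
    for period_costs in costs:
        # Perturb the running totals, then pick the decision that looks best.
        perturbed = [totals[i] + rng.uniform(0.0, 1.0 / epsilon) for i in range(n)]
        choice = min(range(n), key=perturbed.__getitem__)
        choices.append(choice)
        incurred += period_costs[choice]
        # The costs of all decisions are observed after choosing.
        for i in range(n):
            totals[i] += period_costs[i]
    return choices, incurred


if __name__ == "__main__":
    # Toy input: decision 0 always costs 0, decision 1 always costs 1.
    costs = [[0.0, 1.0] for _ in range(200)]
    choices, incurred = follow_the_perturbed_leader(costs, epsilon=0.5)
    best_in_hindsight = min(sum(column) for column in zip(*costs))
    print("incurred:", incurred, "best single decision:", best_in_hindsight)
```

On this toy input the perturbation has size at most 2, so once the cumulative gap between the two decisions exceeds 2 the sketch always picks the cheaper decision, and its total cost stays within a small constant of the best single decision in hindsight.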