A Sherman-Morrison-Woodbury identity for rank augmenting matrices with application to centering
SIAM Journal on Matrix Analysis and Applications
A new algorithm for minimizing convex functions over convex sets
Mathematical Programming: Series A and B
Universal portfolios with and without transaction costs
COLT '97 Proceedings of the tenth annual conference on Computational learning theory
EuroCOLT '99 Proceedings of the 4th European Conference on Computational Learning Theory
Introduction to Stochastic Search and Optimization
Efficient algorithms for universal portfolios
The Journal of Machine Learning Research
Simulated Annealing in Convex Bodies and an O*(n^4) Volume Algorithm
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Online convex optimization in the bandit setting: gradient descent without a gradient
SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Efficient algorithms for online decision problems
Journal of Computer and System Sciences - Special issue: Learning theory 2003
Prediction, Learning, and Games
Algorithms for portfolio management based on the Newton method
ICML '06 Proceedings of the 23rd international conference on Machine learning
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th international conference on Machine learning
Rank minimization via online learning
Proceedings of the 25th international conference on Machine learning
A Nonparametric Asymptotic Analysis of Inventory Planning with Censored Demand
Mathematics of Operations Research
Efficient learning algorithms for changing environments
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
CHOMP: gradient optimization techniques for efficient motion planning
ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation
Efficient Online and Batch Learning Using Forward Backward Splitting
The Journal of Machine Learning Research
On-line estimation with the multivariate Gaussian distribution
COLT'07 Proceedings of the 20th annual conference on Learning theory
Online learning with prior knowledge
COLT'07 Proceedings of the 20th annual conference on Learning theory
SODA '10 Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms
Online learning in adversarial Lipschitz environments
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Optimization and learning for rough terrain legged locomotion
International Journal of Robotics Research
Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
The Journal of Machine Learning Research
Serendipitous learning: learning beyond the predefined label space
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
The Journal of Machine Learning Research
Efficient Learning with Partially Observed Attributes
The Journal of Machine Learning Research
NASA: achieving lower regrets and faster rates via adaptive stepsizes
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Manifold identification in dual averaging for regularized stochastic online learning
The Journal of Machine Learning Research
Maxi-Min discriminant analysis via online learning
Neural Networks
On ensemble techniques for AIXI approximation
AGI'12 Proceedings of the 5th international conference on Artificial General Intelligence
Online portfolio selection: A survey
ACM Computing Surveys (CSUR)
CHOMP: Covariant Hamiltonian optimization for motion planning
International Journal of Robotics Research
Communication-efficient algorithms for statistical optimization
The Journal of Machine Learning Research
In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters an arbitrary (possibly unrelated) convex cost function. Zinkevich [Zin03] introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover’s Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O(√T), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret O(log T) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth [KW99], and Universal Portfolios by Cover [Cov91]. We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection to the follow-the-leader method, and builds on the recent work of Agarwal and Hazan [AH05]. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover’s algorithm, and gradient descent.
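To make the setting concrete, the following is a minimal illustrative sketch of Zinkevich-style projected online gradient descent. The function name, the ball-shaped feasible set, and the gradient-oracle interface are simplifying choices made here for illustration, not details taken from the paper; with step sizes η_t ∝ 1/√t this is the O(√T)-regret regime, while η_t ∝ 1/t corresponds to the logarithmic-regret regime for strongly convex losses that the paper analyzes.

```python
import numpy as np

def online_gradient_descent(grad_oracles, x0, radius, etas):
    """Projected online gradient descent over a Euclidean ball (illustrative sketch).

    grad_oracles : for each round t, a callable returning the gradient of the
                   adversary's cost f_t at the played point (revealed after playing).
    x0           : initial decision inside the feasible ball.
    radius       : radius of the ball-shaped feasible set (a simplifying assumption;
                   any convex set with a projection oracle would do).
    etas         : step-size schedule, e.g. eta_t ~ 1/sqrt(t) or eta_t ~ 1/t.
    Returns the list of points played, one per round.
    """
    x = np.asarray(x0, dtype=float)
    plays = []
    for grad_f, eta in zip(grad_oracles, etas):
        plays.append(x.copy())        # commit to x_t before seeing f_t
        g = grad_f(x)                 # adversary reveals f_t; observe its gradient at x_t
        x = x - eta * g               # gradient step on the just-revealed cost
        norm = np.linalg.norm(x)
        if norm > radius:             # project back onto the feasible ball
            x = x * (radius / norm)
    return plays
```

For example, against the repeated strongly convex cost f_t(x) = ‖x − 1‖² with step sizes η_t = 1/(t+1), the iterates reach the hindsight-optimal point x = 1 within a few rounds.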