Logarithmic regret algorithms for online convex optimization

Authors:
Elad Hazan; Amit Agarwal; Satyen Kale
Affiliations:
IBM Almaden Research Center, San Jose, USA 95120; Department of Computer Science, Princeton University, Princeton, USA; Department of Computer Science, Princeton University, Princeton, USA
Venue:
Machine Learning
Year:
2007


Abstract

In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, the decision-maker encounters a (possibly unrelated) convex cost function. Zinkevich (ICML 2003) introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover's Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret $O(\sqrt{T})$, for an arbitrary sequence of $T$ convex cost functions (with bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret $O(\log T)$ for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth (EuroCOLT 1999), and Universal Portfolios by Cover (Math. Finance 1:1-19, 1991). We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection between the natural follow-the-leader approach and the Newton method. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover's algorithm and gradient descent.
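
To make the regret notion in the abstract concrete, the following is a minimal sketch of online gradient descent on a toy problem. It is not the paper's Newton-based algorithm. Everything specific here is an assumption chosen for illustration: the feasible set is the unit Euclidean ball, the costs are the quadratics f_t(x) = 0.5 * ||x - z_t||^2 (which have curvature H = 1), the step size is 1/(H t), and the helper names (`project_to_ball`, `online_gradient_descent`, `best_fixed_loss`, `regret`) are hypothetical. Under these assumptions the measured regret should stay below 2 (1 + ln T), i.e. it grows logarithmically rather than like the square root of T.

```python
import math
import random


def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius (assumed feasible set)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def loss(x, z):
    """Illustrative cost f_t(x) = 0.5 * ||x - z||^2, chosen only for this sketch."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, z))


def online_gradient_descent(targets, H=1.0):
    """Play x_t, pay f_t(x_t), then take a projected gradient step with eta_t = 1/(H*t)."""
    x = [0.0] * len(targets[0])
    total = 0.0
    for t, z in enumerate(targets, start=1):
        total += loss(x, z)                      # cost is revealed after x_t is chosen
        grad = [a - b for a, b in zip(x, z)]     # gradient of f_t at x_t
        eta = 1.0 / (H * t)
        x = project_to_ball([a - eta * g for a, g in zip(x, grad)])
    return total


def best_fixed_loss(targets):
    """Total cost of the best single decision in hindsight (the mean of the targets)."""
    n = len(targets)
    mean = [sum(col) / n for col in zip(*targets)]
    return sum(loss(mean, z) for z in targets)


def regret(T, dim=3, seed=0):
    """Regret of the sketch on T random targets drawn inside the unit ball."""
    rng = random.Random(seed)
    targets = [project_to_ball([rng.uniform(-1, 1) for _ in range(dim)])
               for _ in range(T)]
    return online_gradient_descent(targets) - best_fixed_loss(targets)


if __name__ == "__main__":
    for T in (10, 100, 1000, 10000):
        print(T, round(regret(T), 3), "bound:", round(2 * (1 + math.log(T)), 3))
```

With these particular quadratic costs and step sizes, each iterate is the running mean of the targets seen so far, so the sketch also coincides with follow-the-leader on this toy problem; that coincidence is a property of the example, not a general statement about the algorithms in the paper.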