We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as the l1-norm for promoting sparsity. We develop extensions of Nesterov's dual averaging method that can exploit the regularization structure in an online setting. At each iteration of these methods, the learning variables are adjusted by solving a simple minimization problem that involves the running average of all past subgradients of the loss function and the whole regularization term, not just its subgradient. In the case of l1-regularization, our method is particularly effective in obtaining sparse solutions. We show that these methods achieve the optimal convergence rates or regret bounds that are standard in the literature on stochastic and online convex optimization. For stochastic learning problems in which the loss functions have Lipschitz continuous gradients, we also present an accelerated version of the dual averaging method.
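To make the per-iteration step concrete, the following is a minimal Python sketch of an l1-regularized dual-averaging update of the kind described above: each step minimizes the running average of past subgradients plus the full l1 term plus a strongly convex proximal term, which admits a closed-form soft-thresholding solution. The step-size constant `gamma` and the subgradient oracle `loss_subgrad` are illustrative assumptions, not quantities fixed by the abstract.

```python
import numpy as np

def l1_rda_step(g_bar, t, lam, gamma):
    """Closed-form minimizer (coordinate-wise) of
        <g_bar, w> + lam * ||w||_1 + (gamma / (2 * sqrt(t))) * ||w||_2^2,
    i.e. soft thresholding applied to the averaged subgradient.
    (Sketch; `gamma` is an assumed proximal step-size constant.)"""
    shrink = np.maximum(np.abs(g_bar) - lam, 0.0)   # zero out coordinates with small average gradient
    return -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink

def l1_rda(loss_subgrad, w0, T, lam=0.01, gamma=1.0):
    """Run T iterations of an l1-regularized dual-averaging scheme.

    `loss_subgrad(w, t)` is an assumed oracle returning a subgradient of the
    loss at w for the example observed at step t."""
    w = w0.copy()
    g_bar = np.zeros_like(w0)
    for t in range(1, T + 1):
        g = loss_subgrad(w, t)
        g_bar = ((t - 1) * g_bar + g) / t            # running average of all past subgradients
        w = l1_rda_step(g_bar, t, lam, gamma)        # uses the whole l1 term, not its subgradient
    return w
```

Because the whole l1 term enters the minimization rather than being linearized, any coordinate whose averaged subgradient stays below the threshold `lam` is set exactly to zero, which is why this style of update tends to produce sparse iterates.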