Stochastic Gradient Descent (SGD) has become popular for solving large-scale supervised machine learning optimization problems, such as SVM training, due to its strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has been implemented in various software packages, it has so far lacked a good convergence analysis. This paper presents a new analysis of Stochastic Dual Coordinate Ascent (SDCA) showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD. This analysis justifies the effectiveness of SDCA for practical applications.
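As a concrete reference point, the following is a minimal sketch of SDCA for the L2-regularized hinge-loss SVM, the setting the abstract refers to. The function name `sdca_svm`, the hyperparameters, and the toy data are illustrative assumptions, not taken from the paper; the closed-form coordinate update used is the standard one for the hinge loss, with dual variables alpha_i in [0, 1] and the primal iterate maintained as w = (1/(lambda*n)) * sum_i alpha_i * y_i * x_i.

```python
import numpy as np

def sdca_svm(X, y, lam=0.1, epochs=10, seed=0):
    """Sketch of SDCA for min_w (1/n) sum_i max(0, 1 - y_i w.x_i) + (lam/2)||w||^2.

    X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    Maintains w = (1 / (lam * n)) * sum_i alpha[i] * y[i] * X[i].
    """
    n, d = X.shape
    alpha = np.zeros(n)                      # dual variables, one per example
    w = np.zeros(d)                          # primal weights induced by alpha
    rng = np.random.default_rng(seed)
    sq_norms = np.einsum('ij,ij->i', X, X)   # precompute ||x_i||^2
    for _ in range(epochs):
        for i in rng.permutation(n):
            if sq_norms[i] == 0.0:
                continue
            # Closed-form maximization of the dual objective over alpha_i alone
            margin = y[i] * X[i].dot(w)
            new_alpha = np.clip(
                alpha[i] + lam * n * (1.0 - margin) / sq_norms[i], 0.0, 1.0
            )
            # Keep w consistent with the updated dual variable
            w += (new_alpha - alpha[i]) * y[i] * X[i] / (lam * n)
            alpha[i] = new_alpha
    return w, alpha

# Illustrative usage on synthetic data (assumed, not from the paper)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] + 0.1 * rng.normal(size=200) > 0, 1.0, -1.0)
w, alpha = sdca_svm(X, y, lam=0.1)
```

Each step optimizes the dual over a single randomly chosen coordinate in closed form while keeping the induced primal vector w in sync; this per-coordinate exact maximization, rather than a noisy primal gradient step with a decaying step size, is what distinguishes SDCA from SGD.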