Stochastic Methods for ℓ1-regularized Loss Minimization
The Journal of Machine Learning Research
We describe and analyze two stochastic methods for ℓ1-regularized loss minimization problems, such as the Lasso. The first method updates the weight of a single feature at each iteration, while the second updates the entire weight vector but uses only a single training example at each iteration. In both methods, the feature or example is chosen uniformly at random. Our theoretical runtime analysis suggests that these stochastic methods should outperform state-of-the-art deterministic approaches, including their own deterministic counterparts, when the problem size is large. We demonstrate the advantage of the stochastic methods in experiments on synthetic and natural data sets.
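As a concrete illustration of the first method, the sketch below implements stochastic coordinate descent for the Lasso objective (1/(2n))‖Xw − y‖² + λ‖w‖₁ in Python with NumPy. It is a minimal reconstruction under assumed choices, not the paper's exact algorithm: the soft_threshold helper, the per-coordinate step sizes beta, and the incremental residual update are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def stochastic_coordinate_descent(X, y, lam, n_iters=10_000, seed=0):
    """Sketch of the first method: minimize (1/(2n))||Xw - y||^2 + lam*||w||_1
    by updating one uniformly random coordinate per iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    residual = X @ w - y              # Xw - y, maintained incrementally
    beta = (X ** 2).sum(axis=0) / n   # per-coordinate curvature of the loss
    for _ in range(n_iters):
        j = rng.integers(d)           # feature chosen uniformly at random
        if beta[j] == 0.0:            # skip all-zero columns
            continue
        g_j = X[:, j] @ residual / n  # partial derivative of the smooth loss
        w_j_new = soft_threshold(w[j] - g_j / beta[j], lam / beta[j])
        residual += (w_j_new - w[j]) * X[:, j]  # keep Xw - y current
        w[j] = w_j_new
    return w
```

Maintaining the residual Xw − y incrementally is what makes a single-coordinate update cost O(n) rather than O(nd). The second method described above would trade off in the opposite direction, touching every coordinate but only one training example per iteration.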