Stochastic methods for l1 regularized loss minimization
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
We describe and analyze two stochastic methods for l1-regularized loss minimization problems, such as the Lasso. The first method updates the weight of a single feature at each iteration, while the second method updates the entire weight vector but uses only a single training example at each iteration. In both methods, the feature or example is chosen uniformly at random. Our theoretical runtime analysis suggests that the stochastic methods should outperform state-of-the-art deterministic approaches, including their deterministic counterparts, when the problem size is large. We demonstrate the advantage of the stochastic methods in experiments on synthetic and natural data sets.
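To make the first method concrete, the following is a minimal sketch of stochastic coordinate descent for the squared-loss Lasso, assuming features scaled to [-1, 1]. The function names, the curvature bound `beta`, and the incremental residual update are illustrative choices for this sketch, not the paper's exact pseudocode.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t (the prox operator of t * |.|)."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def stochastic_coordinate_descent(X, y, lam, n_iters=10000, beta=1.0, seed=0):
    """Minimize (1/2m) * ||X w - y||^2 + lam * ||w||_1 by updating one
    uniformly random coordinate per iteration.

    beta must upper-bound the per-coordinate curvature
    (1/m) * sum_i X[i, j]**2; beta = 1.0 suffices when all
    features lie in [-1, 1] (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.zeros(d)
    residual = X @ w - y                 # maintained incrementally
    for _ in range(n_iters):
        j = rng.integers(d)              # feature chosen uniformly at random
        g_j = X[:, j] @ residual / m     # partial derivative of the smooth part
        w_j_new = soft_threshold(w[j] - g_j / beta, lam / beta)
        residual += (w_j_new - w[j]) * X[:, j]   # O(m) residual update
        w[j] = w_j_new
    return w
```

Each iteration costs O(m) here because the residual is maintained incrementally; with sparse data the cost drops to the number of nonzeros in column j, which is what makes the per-iteration cost independent of the dimension d. The second method is complementary: it applies a full-vector update driven by a single randomly chosen training example per iteration, so its per-iteration cost is independent of the number of examples m instead.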