Stochastic Methods for ℓ1-regularized Loss Minimization
The Journal of Machine Learning Research
We describe and analyze two stochastic methods for ℓ1-regularized loss minimization problems, such as the Lasso. The first method updates the weight of a single feature at each iteration, while the second updates the entire weight vector but uses only a single training example at each iteration. In both methods, the feature or example is chosen uniformly at random. Our theoretical runtime analysis suggests that these stochastic methods should outperform state-of-the-art deterministic approaches, including their own deterministic counterparts, when the problem size is large. We demonstrate the advantage of the stochastic methods in experiments on synthetic and natural data sets.
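As a concrete illustration of the first method, the sketch below implements stochastic coordinate descent for the Lasso objective (1/(2n))‖Xw − y‖² + λ‖w‖₁ in Python with NumPy. It is a minimal reconstruction under assumed choices, not the paper's exact algorithm: the soft_threshold helper, the per-coordinate step sizes beta, and the incremental residual update are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def stochastic_coordinate_descent(X, y, lam, n_iters=10_000, seed=0):
    """Sketch of the first method: minimize (1/(2n))||Xw - y||^2 + lam*||w||_1
    by updating one uniformly random coordinate per iteration."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    residual = X @ w - y              # Xw - y, maintained incrementally
    beta = (X ** 2).sum(axis=0) / n   # per-coordinate curvature of the loss
    for _ in range(n_iters):
        j = rng.integers(d)           # feature chosen uniformly at random
        if beta[j] == 0.0:            # skip all-zero columns
            continue
        g_j = X[:, j] @ residual / n  # partial derivative of the smooth loss
        w_j_new = soft_threshold(w[j] - g_j / beta[j], lam / beta[j])
        residual += (w_j_new - w[j]) * X[:, j]  # keep Xw - y current
        w[j] = w_j_new
    return w
```

Maintaining the residual Xw − y incrementally is what makes a single-coordinate update cost O(n) rather than O(nd). The second method described above would trade off in the opposite direction, touching every coordinate but only one training example per iteration.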