On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
A limited memory algorithm for bound constrained optimization
SIAM Journal on Scientific Computing
Exponentiated gradient versus gradient descent for linear predictors
Information and Computation
Making large-scale support vector machine learning practical
Advances in kernel methods
Newton's Method for Large Bound-Constrained Optimization Problems
SIAM Journal on Optimization
Text Categorization Based on Regularized Linear Classification Methods
Information Retrieval
Feature Selection via Concave Minimization and Support Vector Machines
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Adaptive Sparseness for Supervised Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence
Grafting: fast, incremental feature selection by gradient descent in function space
The Journal of Machine Learning Research
A Feature Selection Newton Method for Support Vector Machine Classification
Computational Optimization and Applications
Gradient LASSO for feature selection
ICML '04 Proceedings of the twenty-first international conference on Machine learning
A Bayesian Approach to Joint Feature Selection and Classifier Design
IEEE Transactions on Pattern Analysis and Machine Intelligence
Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds
IEEE Transactions on Pattern Analysis and Machine Intelligence
Evaluation and extension of maximum entropy models with inequality constraints
EMNLP '03 Proceedings of the 2003 conference on Empirical methods in natural language processing
Exact 1-Norm Support Vector Machines Via Unconstrained Convex Differentiable Minimization
The Journal of Machine Learning Research
On Model Selection Consistency of Lasso
The Journal of Machine Learning Research
Scalable training of L1-regularized log-linear models
Proceedings of the 24th international conference on Machine learning
An Interior-Point Method for Large-Scale l1-Regularized Logistic Regression
The Journal of Machine Learning Research
Efficient projections onto the l1-ball for learning in high dimensions
Proceedings of the 25th international conference on Machine learning
A dual coordinate descent method for large-scale linear SVM
Proceedings of the 25th international conference on Machine learning
Trust Region Newton Method for Logistic Regression
The Journal of Machine Learning Research
A coordinate gradient descent method for nonsmooth separable minimization
Mathematical Programming: Series A and B
Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches
ECML '07 Proceedings of the 18th European conference on Machine Learning
Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines
The Journal of Machine Learning Research
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
Boosting with structural sparsity
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Efficient Euclidean projections in linear time
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Stochastic methods for l1 regularized loss minimization
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Large-scale sparse logistic regression
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Sparse Online Learning via Truncated Gradient
The Journal of Machine Learning Research
Efficient L1-regularized logistic regression
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Sparse reconstruction by separable approximation
IEEE Transactions on Signal Processing
Fixed-Point Continuation for l1-Minimization: Methodology and Convergence
SIAM Journal on Optimization
Bundle Methods for Regularized Risk Minimization
The Journal of Machine Learning Research
A Fast Hybrid Algorithm for Large-Scale l1-Regularized Logistic Regression
The Journal of Machine Learning Research
Iterative Scaling and Coordinate Descent Methods for Maximum Entropy Models
The Journal of Machine Learning Research
A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning
The Journal of Machine Learning Research
A coordinate gradient descent method for l1-regularized convex minimization
Computational Optimization and Applications
Fast Solution of l1-Norm Minimization Problems When the Solution May Be Sparse
IEEE Transactions on Information Theory
A Fast Tracking Algorithm for Generalized LARS/LASSO
IEEE Transactions on Neural Networks
Training and Testing Low-degree Polynomial Data Mappings via Linear SVM
The Journal of Machine Learning Research
An improved GLMNET for l1-regularized logistic regression
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Proximal Methods for Hierarchical Sparse Coding
The Journal of Machine Learning Research
Structured Variable Selection with Sparsity-Inducing Norms
The Journal of Machine Learning Research
A novel feature selection method based on normalized mutual information
Applied Intelligence
An improved GLMNET for L1-regularized logistic regression
The Journal of Machine Learning Research
Learning class-to-image distance via large margin and l1-norm regularization
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part II
Stochastic coordinate descent methods for regularized smooth and nonsmooth losses
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part I
Feature reduction for efficient object detection via l1-norm latent SVM
IScIDE'12 Proceedings of the third Sino-foreign-interchange conference on Intelligent Science and Intelligent Data Engineering
Learning non-linear classifiers with a sparsity constraint using L1 regularization
Proceedings of the 28th Annual ACM Symposium on Applied Computing
Fast training of effective multi-class boosting using coordinate descent optimization
ACCV'12 Proceedings of the 11th Asian conference on Computer Vision - Volume Part II
Multi-target regression with rule ensembles
The Journal of Machine Learning Research
Large-scale linear support vector regression
The Journal of Machine Learning Research
Large-scale linear classification is widely used in many areas. The L1-regularized form can be applied for feature selection; however, its non-differentiability makes training more difficult. Although many optimization methods have been proposed in recent years, they have not been systematically compared. In this paper, we first give a broad review of existing methods. Then, we discuss state-of-the-art software packages in detail and propose two efficient implementations. Extensive comparisons indicate that carefully implemented coordinate descent methods are very suitable for training large document data.
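The abstract's conclusion favors coordinate descent for L1-regularized training. As a minimal illustration of that pattern (a sketch only, not the paper's software): for the simpler L1-regularized least-squares (lasso) objective, each single-coordinate subproblem has a closed-form soft-thresholding solution, which is the same mechanism the surveyed solvers adapt to logistic and SVM losses. Function names below are illustrative, not from any cited package.

```python
import numpy as np

def soft_threshold(z, lam):
    # Proximal operator of lam * |w|: shrink z toward zero by lam.
    return np.sign(z) * max(abs(z) - lam, 0.0)

def cd_lasso(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for min_w 0.5*||Xw - y||^2 + lam*||w||_1.

    Each coordinate update is exact: with the other coordinates fixed,
    the optimal w_j is soft_threshold(X_j^T r_j, lam) / ||X_j||^2,
    where r_j is the residual excluding coordinate j.
    """
    n, d = X.shape
    w = np.zeros(d)
    r = y - X @ w                      # residual, maintained incrementally
    col_sq = (X ** 2).sum(axis=0)      # ||X_j||^2 for each column
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # X_j^T r_j = X_j^T r + ||X_j||^2 * w_j (add back coordinate j)
            rho = X[:, j] @ r + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            r += X[:, j] * (w[j] - w_new)   # keep residual consistent
            w[j] = w_new
    return w
```

For sparse document data, the per-coordinate cost is proportional to the number of nonzeros in column j, which is why this style of method scales well there.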