We consider the least-squares linear regression problem with regularization by the ℓ1-norm, a problem usually referred to as the Lasso. In this paper, we present a detailed asymptotic analysis of the model-selection consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific decay rate, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso on several bootstrapped replications of a given sample, then intersecting the supports of the bootstrap Lasso estimates leads to consistent model selection. This novel variable-selection algorithm, referred to as the Bolasso, compares favorably with other linear regression methods on synthetic data and on datasets from the UCI machine learning repository.
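To make the bootstrap-and-intersect procedure concrete, here is a minimal sketch in Python, assuming NumPy and scikit-learn are available; the function names, the fixed regularization parameter alpha, and the replication count n_bootstrap are illustrative choices, not the paper's notation.

import numpy as np
from sklearn.linear_model import Lasso

def bolasso_support(X, y, alpha=0.1, n_bootstrap=128, seed=0):
    # Run the Lasso on bootstrap replications of (X, y) and intersect the
    # supports: a variable survives only if it is selected in every replication.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = np.ones(p, dtype=bool)
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)            # bootstrap sample, drawn with replacement
        fit = Lasso(alpha=alpha, max_iter=10_000).fit(X[idx], y[idx])
        support &= fit.coef_ != 0.0                 # intersect with this replication's support
    return np.flatnonzero(support)

def bolasso_refit(X, y, selected):
    # Refit unregularized least squares on the intersected support, removing
    # the Lasso's shrinkage bias on the retained coefficients.
    coef = np.zeros(X.shape[1])
    sol, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    coef[selected] = sol
    return coef

The intersection is what drives consistency: variables that should enter the model are selected in every replication with probability tending to one, while each spurious variable is dropped in some replications, so it is eliminated as the number of bootstrap samples grows. Note also that in the paper's asymptotic analysis the regularization parameter decays with the sample size at a specific rate; the fixed alpha above (or a value chosen by cross-validation) merely stands in for that choice.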