We consider the least-squares linear regression problem with regularization by the ℓ1-norm, a problem usually referred to as the Lasso. In this paper, we present a detailed asymptotic analysis of the model-selection consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific decay rate, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso on several bootstrapped replications of a given sample, then intersecting the supports of the bootstrap Lasso estimates leads to consistent model selection. This novel variable-selection algorithm, referred to as the Bolasso, compares favorably with other linear regression methods on synthetic data and on datasets from the UCI machine learning repository.
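To make the bootstrap-and-intersect procedure concrete, here is a minimal sketch in Python, assuming NumPy and scikit-learn are available; the function names, the fixed regularization parameter alpha, and the replication count n_bootstrap are illustrative choices, not the paper's notation.

import numpy as np
from sklearn.linear_model import Lasso

def bolasso_support(X, y, alpha=0.1, n_bootstrap=128, seed=0):
    # Run the Lasso on bootstrap replications of (X, y) and intersect the
    # supports: a variable survives only if it is selected in every replication.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = np.ones(p, dtype=bool)
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)            # bootstrap sample, drawn with replacement
        fit = Lasso(alpha=alpha, max_iter=10_000).fit(X[idx], y[idx])
        support &= fit.coef_ != 0.0                 # intersect with this replication's support
    return np.flatnonzero(support)

def bolasso_refit(X, y, selected):
    # Refit unregularized least squares on the intersected support, removing
    # the Lasso's shrinkage bias on the retained coefficients.
    coef = np.zeros(X.shape[1])
    sol, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    coef[selected] = sol
    return coef

The intersection is what drives consistency: variables that should enter the model are selected in every replication with probability tending to one, while each spurious variable is dropped in some replications, so it is eliminated as the number of bootstrap samples grows. Note also that in the paper's asymptotic analysis the regularization parameter decays with the sample size at a specific rate; the fixed alpha above (or a value chosen by cross-validation) merely stands in for that choice.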