Robust model selection using fast and robust bootstrap

  • Authors: Matias Salibian-Barrera; Stefan Van Aelst

  • Affiliations: University of British Columbia, Department of Statistics, 333-3656 Agricultural Road, Vancouver, BC, V6T 1Z4, Canada; Ghent University, Department of Applied Mathematics and Computer Science, Krijgslaan 281 S9, B-9000 Gent, Belgium

  • Venue: Computational Statistics & Data Analysis
  • Year: 2008

Abstract

Robust model selection procedures control the undue influence that outliers can have on the selection criteria by using both robust point estimators and a bounded loss function when measuring either the goodness-of-fit or the expected prediction error of each model. Furthermore, to avoid favoring over-fitting models, these two measures can be combined with a penalty term for the size of the model. The expected prediction error conditional on the observed data may be estimated using the bootstrap. However, bootstrapping robust estimators becomes extremely time-consuming on moderate- to high-dimensional data sets. It is shown that the expected prediction error can be estimated using a very fast and robust bootstrap method, and that this approach yields a consistent model selection method that is computationally feasible even for a relatively large number of covariates. Moreover, as opposed to other bootstrap methods, this proposal avoids the numerical problems associated with the small bootstrap samples required to obtain consistent model selection criteria. The finite-sample performance of the fast and robust bootstrap model selection method is investigated through a simulation study, while its feasibility and good performance on moderately large regression models are illustrated on several real data examples.
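
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of a bootstrap-based robust selection criterion: a robust fit, a bounded loss applied to scaled prediction residuals, and a penalty on model size. It is an illustration under stated assumptions, not the authors' procedure. It uses a plain (slow) bootstrap rather than the fast and robust bootstrap, a bisquare M-estimator started from least squares rather than a high-breakdown estimator, and an assumed penalty of the form penalty * p * log(n) / n. All function names, tuning constants, and the synthetic data are hypothetical.

```python
import numpy as np

def tukey_rho(u, c=4.685):
    """Tukey bisquare loss, bounded by 1 so a single outlier cannot dominate."""
    v = np.clip(np.abs(u) / c, 0.0, 1.0)
    return 1.0 - (1.0 - v**2) ** 3

def mad_scale(r):
    """Normalized median absolute deviation, a robust residual scale."""
    return 1.4826 * np.median(np.abs(r - np.median(r)))

def m_fit(X, y, sigma=None, c=4.685, n_iter=30):
    """Bisquare regression M-estimate by IRLS (assumption: least squares start)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        s = sigma if sigma is not None else mad_scale(r)
        v = np.clip(np.abs(r) / (c * s), 0.0, 1.0)
        sw = 1.0 - v**2  # square root of the bisquare weight (1 - v^2)^2
        # Weighted least squares step: scale rows by sqrt(weight).
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

def bootstrap_criterion(X, y, cols, sigma, B=50, penalty=2.0, seed=0):
    """Bootstrap estimate of the bounded prediction loss plus a size penalty."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xs = X[:, cols]
    errs = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        beta_b = m_fit(Xs[idx], y[idx], sigma=sigma)
        # Evaluate the bootstrap fit on the original data with the bounded loss.
        errs.append(np.mean(tukey_rho((y - Xs @ beta_b) / sigma)))
    # Assumed penalty form; the paper's exact penalty is not given here.
    return np.mean(errs) + penalty * len(cols) * np.log(n) / n

# Hypothetical synthetic data: only the first covariate matters, 10% outliers.
rng = np.random.default_rng(1)
n = 100
Z = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), Z])
y = 1.0 + 3.0 * Z[:, 0] + rng.normal(size=n)
y[:10] += 20.0  # vertical outliers

# Common residual scale taken from the robust fit of the largest model.
sigma = mad_scale(y - X @ m_fit(X, y))
candidates = [(0,), (0, 1), (0, 1, 2), (0, 1, 2, 3)]
crit = {cols: bootstrap_criterion(X, y, list(cols), sigma) for cols in candidates}
best = min(crit, key=crit.get)
print(best, crit)
```

Each bootstrap replicate here reruns the full iterative fit, which is the cost the abstract describes as prohibitive for larger models. The same seed is reused for every candidate model so that all models are compared on identical resamples.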