Cross-validation prior choice in Bayesian probit regression with many covariates

Authors:
D. Lamnisos; J. E. Griffin; M. F. Steel
Affiliations:
Department of Nursing and Cyprus International Institute for Environmental and Public Health, Cyprus University of Technology, Limassol, Cyprus 3603; School of Mathematics, Statistics and Actuarial Science, University of Kent, Canterbury, UK CT2 7NF; Department of Statistics, University of Warwick, Coventry, UK CV4 7AL
Venue:
Statistics and Computing
Year:
2012

Abstract

This paper examines prior choice in probit regression through a predictive cross-validation criterion. In particular, we focus on situations where the number of potential covariates is far larger than the number of observations, such as in gene expression data. Cross-validation avoids the tendency of such models to fit perfectly. We choose the scale parameter c in the standard variable selection prior as the minimizer of the log predictive score. Naive evaluation of the log predictive score requires substantial computational effort, and we investigate computationally cheaper methods using importance sampling. We find that K-fold importance densities perform best, in combination with either mixing over different values of c or with integrating over c through an auxiliary distribution.
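
The selection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that posterior draws of the probit coefficients are already available for each candidate value of c (the sampler is not shown), and that the log predictive score is the negative average log predictive probability of the held-out responses. For simplicity it uses the full-data posterior as the importance density for leave-one-out prediction, which is a simpler variant than the K-fold importance densities the abstract reports as performing best. The names `is_log_predictive_score`, `select_c`, and `draws_by_c` are hypothetical.

```python
import math


def probit_prob(x, beta):
    """P(y = 1 | x, beta) = Phi(x . beta) under a probit link."""
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def likelihood(y, x, beta):
    """Probability of the observed binary response y given x and beta."""
    p = probit_prob(x, beta)
    return p if y == 1 else 1.0 - p


def is_log_predictive_score(X, y, draws):
    """Importance-sampling estimate of a leave-one-out log predictive score.

    `draws` are coefficient vectors sampled from the full-data posterior
    (assumed given). Reweighting each draw by 1 / p(y_i | x_i, beta) gives
    p(y_i | y_-i) ~= S / sum_s [1 / p(y_i | x_i, beta_s)].
    """
    total = 0.0
    for xi, yi in zip(X, y):
        # Guard against division by zero for numerically impossible draws.
        inv = [1.0 / max(likelihood(yi, xi, b), 1e-300) for b in draws]
        total += math.log(len(draws) / sum(inv))
    return -total / len(y)


def select_c(X, y, draws_by_c):
    """Return the candidate c whose draws give the smallest score."""
    scores = {c: is_log_predictive_score(X, y, d) for c, d in draws_by_c.items()}
    return min(scores, key=scores.get)
```

Here `draws_by_c` would map each candidate scale, for example `{1.0: [...], 10.0: [...]}`, to the draws obtained under that prior. The importance weights above can have high variance when a single observation is influential, which is one reason to consider the K-fold variants discussed in the paper.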