Bayesian variable selection for disease classification using gene expression data

Authors:
Yang Ai-Jun; Song Xin-Yuan
Venue:
Bioinformatics
Year:
2010

Citing 0
Cited by 3:

Variable selection in model-based discriminant analysis. Journal of Multivariate Analysis.
A study of variable selection using g-prior distribution with ridge parameter. Computational Statistics & Data Analysis.
DNA microarray SNP associations with clinical efficacy and side effects of domperidone treatment for gastroparesis. Journal of Biomedical Informatics.

Abstract

Motivation: An important application of gene expression microarray data is the classification of samples into categories. Accurate classification depends upon the method used to identify the most relevant genes. Owing to the large number of genes and relatively small sample size, the selection process can be unstable. Existing methods need to be modified to achieve better analysis of microarray data.

Results: We propose a Bayesian stochastic variable selection approach for gene selection based on a probit regression model with a generalized singular g-prior distribution for regression coefficients. We implement an efficient and dependable algorithm that uses simulation-based Markov chain Monte Carlo methods to sample parameters from the posterior distribution. We also show that this algorithm is robust to the choice of initial values and produces posterior probabilities of related genes for biological interpretation. The performance of the proposed approach is compared with other popular methods in gene selection and classification via the well-known colon cancer and leukemia datasets in the microarray literature.

Availability: Free Matlab code to perform gene selection is available at http://www.sta.cuhk.edu.hk/xysong/geneselection/.

Contact: ajyang81@gmail.com; xysong@sta.cuhk.edu.hk

Supplementary information: Supplementary data are available at Bioinformatics online.
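As a rough illustration of the kind of sampler the abstract describes, the following Python sketch runs stochastic variable selection in a probit model with latent-variable data augmentation. It is not the authors' Matlab code. Several details are assumptions made for the example: the g-prior is taken to be N(0, c (X'X)^+) with a Moore-Penrose pseudo-inverse, each gene has an independent Bernoulli inclusion prior, and the inclusion indicators are updated by one-at-a-time Metropolis flips. The names select_genes, log_marginal and sample_latent and the defaults c=10.0 and prior_incl=0.1 are hypothetical.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for truncated-normal draws


def sample_latent(mean, y, rng):
    """Draw z_i ~ N(mean_i, 1), truncated to z_i > 0 if y_i == 1, else z_i <= 0."""
    z = np.empty_like(mean)
    for i, (m, yi) in enumerate(zip(mean, y)):
        p0 = _STD.cdf(-m)  # P(z_i <= 0)
        lo, hi = (p0, 1.0) if yi == 1 else (0.0, p0)
        # Clipping is a numerical guard so that inv_cdf stays finite.
        u = min(max(rng.uniform(lo, hi), 1e-12), 1 - 1e-12)
        z[i] = m + _STD.inv_cdf(u)
    return z


def log_marginal(z, X, gamma, c):
    """log p(z | gamma) up to a constant, with beta integrated out.

    Assumed prior: beta_gamma ~ N(0, c * pinv(Xg' Xg)), so that
    z | gamma ~ N(0, I + c * P) with P the projection onto col(Xg).
    """
    idx = np.flatnonzero(gamma)
    if idx.size == 0:
        return -0.5 * (z @ z)
    Xg = X[:, idx]
    fit = Xg @ (np.linalg.pinv(Xg) @ z)  # P z, valid even if Xg'Xg is singular
    rank = np.linalg.matrix_rank(Xg)
    return -0.5 * rank * np.log1p(c) - 0.5 * (z @ z - c / (1 + c) * (z @ fit))


def select_genes(X, y, c=10.0, prior_incl=0.1, n_iter=500, burn_in=100, seed=0):
    """Return estimated posterior inclusion probabilities, one per column of X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    gamma = np.zeros(p, dtype=bool)          # start from the empty model
    z = np.where(y == 1, 1.0, -1.0)          # crude initial latent values
    log_odds = np.log(prior_incl / (1 - prior_incl))
    counts = np.zeros(p)
    s = c / (1 + c)                          # g-prior shrinkage factor
    for it in range(n_iter):
        # 1. Metropolis step: propose flipping each indicator in random order.
        cur = log_marginal(z, X, gamma, c)
        for j in rng.permutation(p):
            prop = gamma.copy()
            prop[j] = not prop[j]
            new = log_marginal(z, X, prop, c)
            delta = new - cur + (log_odds if prop[j] else -log_odds)
            if np.log(rng.uniform()) < delta:
                gamma, cur = prop, new
        # 2. Draw beta_gamma | z, gamma ~ N(s * pinv(Xg) z, s * pinv(Xg' Xg)).
        idx = np.flatnonzero(gamma)
        mean = np.zeros(n)
        if idx.size:
            Xg = X[:, idx]
            cov = s * np.linalg.pinv(Xg.T @ Xg)
            beta = rng.multivariate_normal(s * (np.linalg.pinv(Xg) @ z), cov)
            mean = Xg @ beta
        # 3. Draw the latent probit variables given beta and the class labels.
        z = sample_latent(mean, y, rng)
        if it >= burn_in:
            counts += gamma
    return counts / (n_iter - burn_in)
```

Here X would be a samples-by-genes expression matrix and y a 0/1 class label vector. The returned values are the fraction of post-burn-in iterations in which each gene was included, which plays the role of the posterior probabilities mentioned in the abstract. Recomputing the pseudo-inverse for every proposed flip keeps the sketch short but would be too slow for thousands of genes without further optimization.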