Bayesian binary kernel probit model for microarray based cancer classification and gene selection

  • Authors:
  • Sounak Chakraborty

  • Affiliations:
  • Department of Statistics, University of Missouri-Columbia, 209F Middlebush Hall, Columbia, MO 65211-6100, USA

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2009

Abstract

With the arrival of gene expression microarrays, a new challenge has opened up for the identification and classification of cancer tissues. Because the number of genes measured simultaneously is very large compared with the few available tissue samples, cancer staging or classification becomes difficult. In this paper we introduce a hierarchical Bayesian probit model for two-class cancer classification. Instead of assuming a linear structure for the function that relates gene expression to cancer type, we assume only that the relationship is explained by an unknown function belonging to an abstract function space such as a reproducing kernel Hilbert space. Our formulation automatically reduces the dimension of the problem from the large number of covariates (genes) to the small sample size. We combine a Bayesian gene selection scheme with this automatic dimension reduction to adaptively select important genes and classify cancer types under a unified model. The model is highly flexible both in explaining the relationship between cancer types and gene expression measurements and in picking up differentially expressed genes. The proposed model is successfully tested on three simulated data sets and three publicly available real-life data sets: leukemia, colon cancer, and prostate cancer.
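
The dimension reduction mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Gaussian kernel, the scale parameter `theta`, the coefficient vector `beta`, and the helper names are all assumptions made for illustration, and the hierarchical priors, posterior sampling, and gene selection step are omitted. The sketch only shows that an n-by-p expression matrix (few samples, many genes) is replaced by an n-by-n kernel matrix, so the probit model needs one coefficient per sample rather than one per gene.

```python
import math
import numpy as np


def gaussian_kernel(X, theta=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / theta) (assumed kernel)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / theta)


def probit_probabilities(K, beta, beta0=0.0):
    """P(y_i = 1) = Phi(beta0 + sum_j K[i, j] * beta[j]), Phi = standard normal CDF."""
    f = beta0 + K @ beta
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in f])


# Synthetic stand-in for a microarray data set: 20 tissue samples, 2000 genes.
rng = np.random.default_rng(0)
n_samples, n_genes = 20, 2000
X = rng.standard_normal((n_samples, n_genes))

K = gaussian_kernel(X, theta=float(n_genes))  # n_samples x n_samples, not n_genes wide
beta = rng.standard_normal(n_samples)         # hypothetical coefficients, one per sample
probs = probit_probabilities(K, beta)

print(K.shape)      # kernel matrix dimensions
print(probs.shape)  # one class probability per sample
```

In a full Bayesian treatment, `beta` would be given a prior and sampled from its posterior rather than drawn at random as it is here.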