Selecting differentially expressed genes using minimum probability of classification error

  • Authors:
  • Pritha Mahata;Kaushik Mahata

  • Affiliations:
  • School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia;School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2007

Abstract

Discovering genes that are differentially expressed between normal and diseased patients is a central research problem in bioinformatics. It is especially important to find a small number of genetic markers that can be explored for diagnostic purposes. The performance of a set of markers is often measured by the associated classification accuracy. This motivates ranking genes by the minimum probability of classification error (MPE) of each gene. In this work, we use a Bayesian decision-making algorithm to compute the MPE. A quantile-based probability density estimation technique is used to estimate the probability density function of each gene's expression. The method is tested on three datasets: colon cancer, leukaemia, and hereditary breast cancer. The quality of the selected markers is evaluated by the classification accuracy obtained using a support vector machine and a modified naive Bayes classifier. We obtain 96.77% accuracy on the colon cancer dataset and 97.06% accuracy on the leukaemia dataset, using only five genes in each case. Finally, using just three genes, we obtain 100% accuracy on the hereditary breast cancer dataset. We also compare our results with those obtained using genes ranked by p-value and show that genes ranked by MPE perform as well as or better than those ranked by p-value.
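To make the ranking idea concrete, the following minimal sketch estimates a per-gene MPE for a two-class problem and sorts genes by it. It rests on several assumptions that the abstract does not spell out: the MPE is taken to be the Bayes error, i.e. the integral of min(P(c0) p(x|c0), P(c1) p(x|c1)) over the expression value x; class priors are taken to be the class frequencies; and the density estimator is a simple piecewise-constant one that places equal probability mass between consecutive empirical quantiles. That estimator is a stand-in and may differ from the quantile-based technique used in the paper. The function names `quantile_density`, `min_prob_error`, and `rank_genes`, and the parameters `n_bins` and `n_grid`, are hypothetical.

```python
import numpy as np


def quantile_density(sample, grid, n_bins=4):
    """Piecewise-constant density estimate (illustrative stand-in only).

    Each interval between consecutive empirical quantiles carries
    probability mass 1/n_bins, so its height is mass / width.
    """
    edges = np.quantile(sample, np.linspace(0.0, 1.0, n_bins + 1))
    widths = np.maximum(np.diff(edges), 1e-12)  # guard against tied quantiles
    heights = (1.0 / n_bins) / widths
    idx = np.clip(np.searchsorted(edges, grid, side="right") - 1, 0, n_bins - 1)
    inside = (grid >= edges[0]) & (grid <= edges[-1])
    return np.where(inside, heights[idx], 0.0)  # zero outside the sample range


def min_prob_error(x0, x1, n_bins=4, n_grid=2001):
    """Approximate the Bayes error for one gene from two class samples."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    prior0 = len(x0) / (len(x0) + len(x1))  # assumption: priors = class frequencies
    prior1 = 1.0 - prior0
    lo = min(x0.min(), x1.min())
    hi = max(x0.max(), x1.max())
    if hi == lo:
        # Constant gene: the classes are indistinguishable, so the best
        # rule always predicts the more frequent class.
        return min(prior0, prior1)
    grid = np.linspace(lo, hi, n_grid)
    integrand = np.minimum(prior0 * quantile_density(x0, grid, n_bins),
                           prior1 * quantile_density(x1, grid, n_bins))
    # Trapezoidal rule on the uniform grid.
    dx = grid[1] - grid[0]
    return float(np.sum((integrand[1:] + integrand[:-1]) * 0.5) * dx)


def rank_genes(X, y, n_bins=4):
    """Return gene indices sorted by increasing MPE, plus the MPE values.

    X has shape (n_samples, n_genes); y holds binary labels 0/1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mpes = np.array([min_prob_error(X[y == 0, g], X[y == 1, g], n_bins)
                     for g in range(X.shape[1])])
    return np.argsort(mpes, kind="stable"), mpes


if __name__ == "__main__":
    # Toy data: gene 0 separates the classes, gene 1 does not.
    gene0 = [0, 1, 2, 3, 10, 11, 12, 13]
    gene1 = [0, 1, 2, 3, 0, 1, 2, 3]
    X = np.column_stack([gene0, gene1])
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    order, mpes = rank_genes(X, y)
    print("ranking:", order.tolist(), "MPE:", mpes.round(3).tolist())
```

In this sketch a gene whose class-conditional densities do not overlap gets an MPE of 0, while a gene with identical distributions in both classes gets an MPE equal to the smaller class prior. The top-ranked genes could then be passed to any classifier to check the resulting accuracy, as the abstract describes for the support vector machine and the modified naive Bayes classifier.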