Comparison of feature selection and classification combinations for cancer classification using microarray data

  • Authors:
  • Vijayan Vinaya;Nadeem Bulsara;Chetan J. Gadgil;Mugdha Gadgil

  • Affiliations:
  • Department of Bioinformatics, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Akurdi, Pune 411044, India.;Department of Bioinformatics, Dr. D.Y. Patil Biotechnology and Bioinformatics Institute, Akurdi, Pune 411044, India.;Chemical Engineering and Process Development Division, National Chemical Laboratory, CSIR, Dr. Homi Bhabha Road, Pune 411008, India.;Chemical Engineering and Process Development Division, National Chemical Laboratory, CSIR, Dr. Homi Bhabha Road, Pune 411008, India

  • Venue:
  • International Journal of Bioinformatics Research and Applications
  • Year:
  • 2009

Abstract

High-throughput gene expression data can be used to identify biomarker profiles for classification. The accuracy of microarray-based sample classification depends on both the algorithm used to select the features (genes) and the classification algorithm itself. We evaluated the performance of over 2000 combinations of feature selection and classification algorithms in classifying cancer datasets. One of these combinations (a support vector machine (SVM) for ranking genes + sequential minimal optimization (SMO) for classification) shows excellent classification accuracy using a small number of genes across the three cancer datasets tested. Notably, classification using 15 selected genes yields 96% accuracy for a dataset obtained on an independent microarray platform.
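
The general idea of ranking genes by the weights of a linear SVM and then classifying with only the top-ranked genes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression values are invented toy data, the function names (`train_linear_svm`, `rank_features`, `select_columns`, `predict`) are hypothetical, and the SVM is trained with plain subgradient descent on the hinge loss rather than with SMO, purely to keep the sketch dependency-free.

```python
# Toy sketch: rank genes by |weight| of a linear SVM, keep the top k,
# retrain on those genes only, and classify held-out samples.
# Assumptions: invented data; subgradient descent is used instead of SMO.

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rank_features(w):
    """Feature indices ordered by decreasing absolute SVM weight."""
    return sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)

def select_columns(X, cols):
    """Keep only the listed feature columns of each sample."""
    return [[row[j] for j in cols] for row in X]

def predict(w, b, x):
    """Class label (+1 or -1) from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Invented expression matrix: 6 samples x 4 genes; only gene 2 is informative.
X_train = [
    [0.1, -0.2, 2.0, 0.3],
    [-0.1, 0.2, 1.8, -0.3],
    [0.2, 0.1, 2.2, 0.1],
    [0.1, -0.2, -2.0, 0.3],
    [-0.1, 0.2, -1.9, -0.3],
    [0.2, 0.1, -2.1, 0.1],
]
y_train = [1, 1, 1, -1, -1, -1]
X_test = [[0.0, 0.1, 1.5, -0.2], [0.1, 0.0, -1.7, 0.2]]
y_test = [1, -1]

# Step 1: train on all genes and rank them by absolute weight.
w_full, b_full = train_linear_svm(X_train, y_train)
ranking = rank_features(w_full)

# Step 2: keep the top k genes and retrain on the reduced data.
k = 1
top_genes = ranking[:k]
w_top, b_top = train_linear_svm(select_columns(X_train, top_genes), y_train)

# Step 3: classify held-out samples using only the selected genes.
predictions = [predict(w_top, b_top, x) for x in select_columns(X_test, top_genes)]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("selected genes:", top_genes, "held-out accuracy:", accuracy)
```

On real microarray data the gene ranking would need to be computed inside each cross-validation fold, since selecting genes on the full dataset before evaluation inflates the reported accuracy.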