Selecting few genes for microarray gene expression classification

  • Authors:
  • Carlos J. Alonso-González;Q. Isaac Moro;Oscar J. Prieto;M. Aránzazu Simón

  • Affiliations:
  • Department of Computer Science, E.T.S.I Informática, University of Valladolid, Valladolid, Spain (all four authors)

  • Venue:
  • CAEPIA'09 Proceedings of the Current topics in artificial intelligence, and 13th conference on Spanish association for artificial intelligence
  • Year:
  • 2009

Abstract

Because microarray data contain a very large number of gene expressions, feature extraction techniques are usually applied before inducing classifiers. A common criterion for deciding how many genes to select is to minimize the classifier error. However, given the risk of overfitting caused by the small sample size, and the fact that the number of selected genes is usually larger than the suspected number of discriminating genes, this work proposes relaxing the minimum error rate criterion. The paper shows that, using a small number of feature selection and classification methods, it is possible to find configurations that select few genes without significantly worsening the error rate of the best classifier. The average ranking for 10 to 40 genes shows that SVM-RFE with Naïve Bayes and FCBF with SVM behave consistently well.
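
The relaxed criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract speaks of not "significantly" worsening the error rate, which suggests a statistical test, whereas this sketch substitutes a fixed tolerance as a simplifying assumption. The function name, the `tolerance` parameter, and the example error rates are all hypothetical.

```python
from typing import Mapping


def fewest_genes_within_tolerance(
    error_by_gene_count: Mapping[int, float], tolerance: float
) -> int:
    """Return the smallest gene count whose error rate is at most
    `tolerance` above the lowest error rate observed.

    Assumption: a fixed tolerance stands in for the significance
    test implied by the abstract.
    """
    if not error_by_gene_count:
        raise ValueError("error_by_gene_count must not be empty")
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")

    # Strict minimum-error criterion: the best error over all gene counts.
    best_error = min(error_by_gene_count.values())

    # Relaxed criterion: accept every gene count close enough to the best.
    acceptable = [
        n_genes
        for n_genes, error in error_by_gene_count.items()
        if error <= best_error + tolerance
    ]
    return min(acceptable)


# Hypothetical error rates, invented purely for illustration.
example_errors = {10: 0.12, 20: 0.10, 40: 0.09, 80: 0.08}
chosen = fewest_genes_within_tolerance(example_errors, tolerance=0.03)
print(chosen)
```

With a tolerance of zero the function reduces to the usual minimum-error criterion; increasing the tolerance trades a small amount of accuracy for a shorter gene list.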