Sparse and stable gene selection with consensus SVM-RFE

  • Authors: E. Tapia; P. Bulacio; L. Angelone
  • Affiliations (listed identically for all three authors): CIFASIS - Conicet, Centro Internacional Franco Argentino de Ciencias de la Informacion y de Sistemas, 27 de Febrero 210 bis, S2000EZP Rosario, Argentina and Facultad de Ciencias Exactas, Ingenier& ...
  • Venue: Pattern Recognition Letters
  • Year: 2012

Abstract

We describe a method for sparse and stable gene selection that combines a number of unstable but low-cost SVM-RFE units, referred to as SVM-RFE subunits. Using a comprehensive simulation study, we show that two constraints markedly improve the gene selection precision, dimensionality reduction ratio, and stability of low-cost SVM-RFE subunits while keeping computational costs affordable: a consensus constraint with respect to variations in the gene removal policy, and a stability constraint with respect to perturbations in the training data. The method, which does not require the number of selected genes to be fixed in advance, has two stages. In the first (spreading), multiple rough gene removal policies are applied to multiple surrogate training datasets. In the second (despreading), multiple consensus gene sets with respect to variations in the gene removal policy are obtained and passed through a stability filter, which selects the best-performing gene set. Hence, while the consensus constraint performs strong dimensionality reduction at affordable computational cost, the stability constraint ensures acceptable gene selection stability and further dimensionality reduction. The method is validated on three benchmark microarray datasets.
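
The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify most of the details, so the following are all assumptions made for illustration: surrogate datasets are bootstrap resamples, a removal policy is the fraction of genes dropped per RFE iteration, consensus is a plain intersection across policies, stability is the mean Jaccard similarity to the other consensus sets, and the linear SVM is trained with a Pegasos-style subgradient loop. The `floor` parameter is a simplification that stops each subunit at a fixed size; the published method does not require preselecting the number of genes. All function names are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, steps=300, seed=0):
    """Pegasos-style subgradient descent for a linear SVM; labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, steps + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        w = [(1.0 - eta * lam) * wj for wj in w]          # regularization shrink
        if margin < 1.0:                                  # hinge loss is active
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def rfe_subunit(X, y, drop_fraction, floor):
    """One low-cost SVM-RFE run: refit, then drop the lowest-weight genes.
    `floor` is an assumed stopping size, not part of the published method."""
    genes = list(range(len(X[0])))
    while len(genes) > floor:
        Xs = [[row[g] for g in genes] for row in X]
        w = train_linear_svm(Xs, y)
        order = sorted(range(len(genes)), key=lambda j: w[j] ** 2, reverse=True)
        n_drop = max(1, int(len(genes) * drop_fraction))
        n_keep = max(floor, len(genes) - n_drop)
        genes = [genes[j] for j in order[:n_keep]]
    return set(genes)

def surrogate(X, y, rng):
    """Bootstrap resample used as one surrogate training dataset (assumption)."""
    idx = [rng.randrange(len(X)) for _ in X]
    return [X[i] for i in idx], [y[i] for i in idx]

def consensus(gene_sets):
    """Genes retained under every removal policy (plain intersection)."""
    return set.intersection(*gene_sets)

def jaccard(a, b):
    """Jaccard similarity of two gene sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def consensus_svm_rfe(X, y, policies=(0.1, 0.3, 0.5), n_surrogates=5,
                      floor=8, seed=0):
    """Return (selected genes, stability score). Needs n_surrogates >= 2."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_surrogates):
        # Spreading: every removal policy is run on every surrogate dataset.
        Xb, yb = surrogate(X, y, rng)
        runs = [rfe_subunit(Xb, yb, p, floor) for p in policies]
        # Despreading, part 1: consensus across removal policies.
        candidates.append(consensus(runs))

    def stability(k):
        sims = [jaccard(candidates[k], c)
                for j, c in enumerate(candidates) if j != k]
        return sum(sims) / len(sims)

    # Despreading, part 2: stability filter keeps the most stable consensus set.
    best = max(range(len(candidates)), key=stability)
    return candidates[best], stability(best)

def make_toy_data(n_samples=40, n_genes=30, n_informative=3, seed=1):
    """Synthetic data: the first `n_informative` genes are shifted by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y
```

Calling `consensus_svm_rfe(*make_toy_data())` returns a gene index set together with its stability score. Because consensus is taken by intersection, the returned set can be smaller than `floor`, which loosely mirrors how the consensus constraint drives dimensionality reduction in the description above.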