MaskedPainter: Feature selection for microarray data analysis

  • Authors: Daniele Apiletti, Elena Baralis, Giulia Bruno, Alessandro Fiori
  • Affiliation (all authors): Dipartimento di Automatica e Informatica, Politecnico di Torino, Torino, Italy
  • Venue: Intelligent Data Analysis
  • Year: 2012

Abstract

Selecting a small number of discriminative genes from thousands is a fundamental task in microarray data analysis. Effective feature selection lets biologists investigate a subset of genes instead of the entire set, avoiding insignificant, noisy, and redundant features. This paper presents MaskedPainter, a feature selection method for gene expression data. The method measures the ability of each gene to separate samples belonging to different classes and ranks genes by an overlap score. A density-based technique smooths the effect of outliers on the overlap score computation. As in other approaches, the user can set the number of selected genes. Alternatively, the algorithm can automatically detect the minimum set of genes that yields the best classification coverage of the training set samples. The effectiveness of the approach is demonstrated through an empirical study on public microarray datasets with different characteristics. Experimental results show that the proposed approach yields higher classification accuracy than widely used feature selection techniques.
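
The abstract does not give the exact definition of the overlap score, the density-based smoothing, or the coverage criterion, so the following is only an illustrative sketch of the general idea and not the MaskedPainter algorithm itself. It assumes a per-class interval of mean ± 2 standard deviations as a crude stand-in for the outlier smoothing, scores each gene by the pairwise overlap of those intervals, and uses a greedy set cover to pick a small set of genes when no gene count is given. All function names, the `width` parameter, and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def class_intervals(values, labels, width=2.0):
    """Per-class interval [mean - width*std, mean + width*std] for one gene.

    Assumption: this interval is a simple substitute for the paper's
    density-based outlier smoothing, which the abstract does not specify.
    """
    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)
    return {
        y: (mean(vs) - width * pstdev(vs), mean(vs) + width * pstdev(vs))
        for y, vs in by_class.items()
    }


def overlap_score(values, labels):
    """Summed pairwise overlap of class intervals, scaled by the gene's range.

    Lower is better; the value is not bounded by 1.
    """
    span = max(values) - min(values)
    if span == 0:
        return float("inf")  # a constant gene cannot separate classes
    intervals = list(class_intervals(values, labels).values())
    total = 0.0
    for i in range(len(intervals)):
        for j in range(i + 1, len(intervals)):
            lo = max(intervals[i][0], intervals[j][0])
            hi = min(intervals[i][1], intervals[j][1])
            total += max(0.0, hi - lo)
    return total / span


def covered_samples(values, labels):
    """Indices of samples lying in their own class interval and in no other."""
    intervals = class_intervals(values, labels)
    covered = set()
    for idx, (v, y) in enumerate(zip(values, labels)):
        lo, hi = intervals[y]
        in_other = any(l <= v <= h for c, (l, h) in intervals.items() if c != y)
        if lo <= v <= hi and not in_other:
            covered.add(idx)
    return covered


def select_genes(expr, labels, k=None):
    """Return the top-k genes by overlap score, or a greedy minimal cover if k is None."""
    ranked = sorted(expr, key=lambda g: overlap_score(expr[g], labels))
    if k is not None:
        return ranked[:k]
    cover = {g: covered_samples(expr[g], labels) for g in ranked}
    remaining = set().union(*cover.values())  # samples that some gene can cover
    chosen = []
    while remaining:
        # Ties go to the gene with the lower overlap score (earlier in `ranked`).
        best = max(ranked, key=lambda g: len(cover[g] & remaining))
        chosen.append(best)
        remaining -= cover[best]
    return chosen


# Toy data: six samples, two classes, three hypothetical genes.
labels = ["A", "A", "A", "B", "B", "B"]
expr = {
    "g_good": [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],
    "g_noise": [1.0, 5.0, 3.0, 1.1, 4.9, 3.1],
    "g_const": [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
}
print(select_genes(expr, labels))       # automatic selection by coverage
print(select_genes(expr, labels, k=2))  # user-specified number of genes
```

Greedy set cover is used here because it is simple and gives a small, though not necessarily minimum, gene set; the paper's own selection procedure may differ.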