Wavelet selection for disease classification by DNA microarray data

  • Authors:
  • Loris Nanni;Alessandra Lumini

  • Affiliations:
  • DEIS, IEIIT - CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy;DEIS, IEIIT - CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Abstract

Microarrays report the expression levels of tens of thousands of genes, and the resulting high-dimensional feature vector also contains information that is irrelevant for accurate classification. Moreover, only a few training samples are available, so feature reduction should be performed before the classification step to avoid the curse of dimensionality. Here, we propose a set of orthogonal wavelet detail coefficients from different mother wavelets to extract features from the microarray data. We propose a multi-classifier system in which each classifier, a support vector machine, is trained on a different set of detail coefficients, and the classifiers are combined by the "sum rule". The sets of detail coefficients are selected by running Sequential Forward Floating Selection (SFFS). The proposed method is validated using the area under the ROC curve as the performance indicator, and the experiments are carried out on four datasets: Breast, Ovarian, Lung, and Prostate. The results show that the proposed method outperforms what can be obtained with a single set of detail coefficients. Moreover, we show that a random subspace ensemble of classifiers outperforms a stand-alone classifier even when the detail coefficients are used as features.
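
The three building blocks named in the abstract (wavelet detail coefficients, sum-rule fusion, and the area under the ROC curve) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names `haar_detail`, `sum_rule`, and `auc` are hypothetical, the Haar wavelet stands in for whichever mother wavelets the paper selects, and the toy scores replace the outputs of trained support vector machines. SFFS and the SVM training itself are omitted.

```python
import math


def haar_detail(signal):
    """Single-level Haar detail coefficients of an even-length sequence.

    Assumption: Haar is used only as the simplest orthogonal wavelet;
    sign conventions for the detail filter vary between libraries.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    return [(signal[i] - signal[i + 1]) / root2
            for i in range(0, len(signal), 2)]


def sum_rule(score_lists):
    """Fuse classifiers by summing their scores sample by sample."""
    return [sum(scores) for scores in zip(*score_lists)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy expression profile (hypothetical values, not microarray data).
profile = [1.0, 1.0, 4.0, 2.0]
details = haar_detail(profile)

# Hypothetical scores of two classifiers on four samples; in the described
# method each score list would come from an SVM trained on a different
# set of detail coefficients.
labels = [1, 1, 0, 0]
clf_a = [0.9, 0.3, 0.4, 0.1]
clf_b = [0.2, 0.8, 0.1, 0.3]
fused = sum_rule([clf_a, clf_b])

print("detail coefficients:", details)
print("AUC classifier A:", auc(clf_a, labels))
print("AUC classifier B:", auc(clf_b, labels))
print("AUC sum rule:", auc(fused, labels))
```

With these hand-picked scores each classifier alone reaches an AUC of 0.75 and the fused scores reach 1.0. The numbers only illustrate the mechanics of the sum rule and say nothing about the results reported in the paper.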