Robust SVM-based biomarker selection with noisy mass spectrometric proteomic data

  • Authors:
  • Elena Marchiori;Connie R. Jimenez;Mikkel West-Nielsen;Niels H. H. Heegaard

  • Affiliations:
  • Department of Computer Science, Vrije Universiteit Amsterdam, The Netherlands;Department of Molecular and Cellular Neurobiology, Vrije Universiteit Amsterdam, The Netherlands;Department of Autoimmunology, Statens Serum Institut, Copenhagen, Denmark;Department of Autoimmunology, Statens Serum Institut, Copenhagen, Denmark

  • Venue:
  • EuroGP'06 Proceedings of the 2006 international conference on Applications of Evolutionary Computing
  • Year:
  • 2006

Abstract

Computational analysis of mass spectrometric (MS) proteomic data from sera is of potential relevance for diagnosis, prognosis, choice of therapy, and the study of disease activity. To this end, feature selection techniques based on machine learning can be applied to detect potential biomarkers and biomarker patterns. A key issue concerns the interpretability and robustness of the results produced by such techniques. In this paper we propose a robust method for feature selection with MS proteomic data. The method consists of the sequential application of a filter feature selection algorithm, RELIEF, followed by multiple runs of a wrapper feature selection technique based on support vector machines (SVM), where each run is obtained by changing the class label of one support vector. The frequencies with which features are selected over the runs are used to identify features that are robust with respect to perturbations of the data. The method is tested on a dataset produced by a specific MS technique, MALDI-TOF MS. Two classes have been artificially generated by spiking. Moreover, the samples have been collected at different storage durations. Leave-one-out cross validation (LOOCV) applied to the resulting dataset indicates that the proposed feature selection method is capable of identifying highly discriminatory proteomic patterns.
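The two-stage idea in the abstract (a RELIEF filter, then selection frequencies counted over label-perturbed runs) can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. All function names are hypothetical, and the RELIEF variant visits every sample once rather than drawing a random subset. The SVM-based wrapper is not implemented: `select` is a placeholder callable, and `top_k_by_relief` is only a stand-in selector. Labels are assumed to be coded as -1/+1.

```python
from collections import Counter


def relief_weights(X, y):
    """Two-class RELIEF scores; a higher score means a more discriminative feature.

    Assumption: every sample is visited once (deterministic variant)
    instead of drawing a random subset of samples.
    """
    n, d = len(X), len(X[0])
    # Feature ranges, used to scale each difference into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbour available
        h = min(hits, key=lambda k: dist(X[i], X[k]))    # nearest hit
        m = min(misses, key=lambda k: dist(X[i], X[k]))  # nearest miss
        for j in range(d):
            # Reward separation from the miss, penalise separation from the hit.
            w[j] += (diff(X[i], X[m], j) - diff(X[i], X[h], j)) / n
    return w


def top_k_by_relief(X, y, k=1):
    """Stand-in selector (NOT the SVM wrapper): indices of the k best RELIEF scores."""
    w = relief_weights(X, y)
    return sorted(range(len(w)), key=lambda j: w[j], reverse=True)[:k]


def selection_frequencies(X, y, flip_indices, select):
    """Fraction of perturbed runs in which each feature is selected.

    Each run flips the label of exactly one sample listed in flip_indices.
    Assumption: labels are coded as -1/+1.
    """
    counts = Counter()
    for i in flip_indices:
        y_run = list(y)
        y_run[i] = -y_run[i]  # perturb a single class label
        counts.update(select(X, y_run))
    return {f: c / len(flip_indices) for f, c in counts.items()}
```

In a fuller pipeline, `flip_indices` would hold the indices of the support vectors of an SVM trained on the RELIEF-filtered data, and `select` would be the SVM-based wrapper. Features whose frequency stays close to 1 across the runs are the ones the abstract describes as robust to perturbations of the data.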