A new genetic algorithm in proteomics: Feature selection for SELDI-TOF data

  • Authors:
  • Christelle Reynès;Robert Sabatier;Nicolas Molinari;Sylvain Lehmann

  • Affiliations:
  • Laboratoire Physique Industrielle et Traitement de l'Information, EA 2415, Faculté de Pharmacie, 15 av. Charles Flahault, BP 14491, 34093 Montpellier Cedex 5, France;Laboratoire Physique Industrielle et Traitement de l'Information, EA 2415, Faculté de Pharmacie, 15 av. Charles Flahault, BP 14491, 34093 Montpellier Cedex 5, France;Laboratoire de Biostatistique, EA 2415, Institut Universitaire de Recherche Clinique, 641 av. G. Giraud, 34093 Montpellier, France;Institut de Génétique Humaine du CNRS, UPR 1142, 141, rue de la Cardonille, 34396 Montpellier Cedex 5, France

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2008

Abstract

Mass spectrometry of clinical specimens is used to identify diagnostic biomarkers. A reliable method for both feature selection and classification is therefore required. A novel method is proposed to find biomarkers in SELDI-TOF data in order to perform robust classification. The feature selection is based on a new genetic algorithm. For the classification, a method has been developed that uses decision stumps to account for the high variability in intensity. Moreover, as the samples are often small, it is more appropriate to use the decision stumps simultaneously than to build a complete tree. The thresholds of the decision stumps are determined within the same genetic algorithm. Finally, the method was generalized to more than two groups using pairwise coupling. The resulting algorithm was applied to two data sets: a publicly available one containing two groups, which allows a comparison with other methods from the literature, and a new one containing three groups.
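
To make the idea of decision stumps "used simultaneously" with thresholds chosen by a genetic algorithm more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the encoding of an individual as a list of (feature, threshold, sign) stumps, the majority vote, the elitist selection scheme, and all function names (stump_vote, fitness, evolve, and so on) are assumptions made for illustration. The data are synthetic toy values, not SELDI-TOF spectra, and the extension to more than two groups by pairwise coupling is not shown.

```python
import random

def stump_vote(stumps, x):
    """Majority vote of decision stumps; each stump is (feature, threshold, sign).

    A stump votes `sign` when x[feature] > threshold and `-sign` otherwise.
    A positive total means class 1, anything else class 0.
    """
    votes = sum(sign if x[f] > t else -sign for f, t, sign in stumps)
    return 1 if votes > 0 else 0

def fitness(stumps, X, y):
    """Fraction of samples classified correctly by the stump committee."""
    return sum(stump_vote(stumps, x) == label for x, label in zip(X, y)) / len(y)

def random_stump(X, rng):
    """Draw a random feature, a threshold within its observed range, and a sign."""
    f = rng.randrange(len(X[0]))
    values = [x[f] for x in X]
    return (f, rng.uniform(min(values), max(values)), rng.choice((-1, 1)))

def crossover(a, b, rng):
    """One-point crossover between two individuals of equal length."""
    cut = rng.randrange(1, len(a)) if len(a) > 1 else 0
    return a[:cut] + b[cut:]

def mutate(ind, X, rng):
    """Replace one randomly chosen stump with a fresh random one."""
    child = list(ind)
    child[rng.randrange(len(child))] = random_stump(X, rng)
    return child

def evolve(X, y, n_stumps=3, pop_size=30, generations=40, seed=0):
    """Elitist GA (assumed scheme): keep the best half, refill with offspring."""
    rng = random.Random(seed)
    pop = [[random_stump(X, rng) for _ in range(n_stumps)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, X, y), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), X, rng))
        pop = elite + children
    return max(pop, key=lambda ind: fitness(ind, X, y))

# Synthetic toy data: 40 samples, 6 features, only feature 2 separates the groups.
data_rng = random.Random(1)
X, y = [], []
for i in range(40):
    label = i % 2
    x = [data_rng.gauss(0, 1) for _ in range(6)]
    x[2] += 4 * label
    X.append(x)
    y.append(label)

best = evolve(X, y)
print("selected stumps:", best)
print("training accuracy:", fitness(best, X, y))
```

In this sketch the features that appear in the best individual play the role of selected biomarkers, and feature selection and threshold tuning happen in the same search, which mirrors the design described in the abstract. An odd number of stumps is used so that the vote cannot tie. The accuracy reported here is measured on the training data only; a real evaluation would need held-out samples or cross-validation.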