Proteomic mass spectra classification using decision tree based ensemble methods

  • Authors:
  • Pierre Geurts;Marianne Fillet;Dominique De Seny;Marie-Alice Meuwis;Michel Malaise;Marie-Paule Merville;Louis Wehenkel

  • Affiliations:
  • Department of Electrical Engineering and Computer Science, University of Liège, 4000 Liège, Belgium;Laboratory of Clinical Chemistry and Rheumatology, CBIG - Centre of Biomedical Integrative Genoproteomics, University of Liège, 4000 Liège, Belgium;Laboratory of Clinical Chemistry and Rheumatology, CBIG - Centre of Biomedical Integrative Genoproteomics, University of Liège, 4000 Liège, Belgium;Laboratory of Clinical Chemistry and Rheumatology, CBIG - Centre of Biomedical Integrative Genoproteomics, University of Liège, 4000 Liège, Belgium;Laboratory of Clinical Chemistry and Rheumatology, CBIG - Centre of Biomedical Integrative Genoproteomics, University of Liège, 4000 Liège, Belgium;Laboratory of Clinical Chemistry and Rheumatology, CBIG - Centre of Biomedical Integrative Genoproteomics, University of Liège, 4000 Liège, Belgium;Department of Electrical Engineering and Computer Science, University of Liège, 4000 Liège, Belgium

  • Venue:
  • Bioinformatics
  • Year:
  • 2005

Abstract

Motivation: Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids such as serum, saliva or urine. These measurements can be used in many medical applications to diagnose the current state of a disease or to predict its evolution. Recent developments in machine learning make it possible to exploit such datasets, which are characterized by small numbers of very high-dimensional samples.

Results: We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time-of-flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.

Supplementary information: Additional tables, appendices and datasets may be found at http://www.montefiore.ulg.ac.be/~geurts/Papers/Proteomic-suppl.html

Contact: p.geurts@ulg.ac.be
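
Illustration (not taken from the paper): the abstract does not say which ensemble variants or settings the authors use, so the sketch below only shows the general idea of a decision tree ensemble that yields both a predictive model and a ranking of candidate biomarkers. It is a deliberately simplified stand-in: bagged one-split trees (stumps) with random feature subsets, trained on synthetic data. All names, the synthetic "spectra", the two planted informative peaks (columns 17 and 42) and every parameter value are assumptions made for this example.

```python
import random

def make_spectra(n_per_class, n_features, informative, rng):
    """Synthetic stand-in for spectra: rows are samples, columns are peak intensities."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += 5.0  # shift the hypothetical biomarker peaks
            X.append(row)
            y.append(label)
    return X, y

def gini(n0, n1):
    """Gini impurity of a node holding n0 class-0 and n1 class-1 samples."""
    n = n0 + n1
    if n == 0:
        return 0.0
    p = n1 / n
    return 2.0 * p * (1.0 - p)

def best_split(X, y, rows, feature):
    """Scan all thresholds on one feature; return (impurity reduction, threshold)."""
    pairs = sorted((X[i][feature], y[i]) for i in rows)
    n = len(pairs)
    total1 = sum(label for _, label in pairs)
    parent = gini(n - total1, total1)
    best = (0.0, None)
    left1 = 0
    for k in range(1, n):
        left1 += pairs[k - 1][1]
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # no threshold fits between equal values
        right1 = total1 - left1
        child = (k * gini(k - left1, left1)
                 + (n - k) * gini(n - k - right1, right1)) / n
        if parent - child > best[0]:
            best = (parent - child, (pairs[k][0] + pairs[k - 1][0]) / 2.0)
    return best

def fit_stump(X, y, n_candidates, rng):
    """One-split tree grown on a bootstrap sample and a random feature subset."""
    n = len(X)
    rows = [rng.randrange(n) for _ in range(n)]
    gain, thr, feat = 0.0, None, None
    for j in rng.sample(range(len(X[0])), n_candidates):
        g, t = best_split(X, y, rows, j)
        if t is not None and g > gain:
            gain, thr, feat = g, t, j
    if feat is None:
        return None  # degenerate bootstrap sample, no useful split
    left = [y[i] for i in rows if X[i][feat] <= thr]
    right = [y[i] for i in rows if X[i][feat] > thr]
    majority = lambda labels: int(2 * sum(labels) > len(labels))
    return feat, thr, majority(left), majority(right), gain

def fit_ensemble(X, y, n_trees, n_candidates, rng):
    """Grow the stumps and accumulate impurity reduction per feature."""
    stumps, importance = [], [0.0] * len(X[0])
    for _ in range(n_trees):
        stump = fit_stump(X, y, n_candidates, rng)
        if stump is not None:
            stumps.append(stump)
            importance[stump[0]] += stump[4]
    return stumps, importance

def predict(stumps, row):
    """Majority vote over all stumps."""
    votes = sum(l if row[f] <= t else r for f, t, l, r, _ in stumps)
    return int(2 * votes > len(stumps))

rng = random.Random(0)
informative = [17, 42]  # planted "biomarker" columns (assumption)
X_train, y_train = make_spectra(20, 200, informative, rng)
X_test, y_test = make_spectra(20, 200, informative, rng)
stumps, importance = fit_ensemble(X_train, y_train, 100, 100, rng)

# Rank features by accumulated impurity reduction; the top ones are biomarker candidates.
ranking = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
top_features = ranking[:2]
accuracy = sum(predict(stumps, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print("top-ranked features:", top_features, "test accuracy:", accuracy)
```

The setting mirrors the regime mentioned in the Motivation paragraph (few samples, many features): 40 training samples against 200 features. The candidate subset size of 100 is an arbitrary choice for this toy problem, and real spectra would need preprocessing and a proper cross-validation protocol that this sketch omits.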