Study on preprocessing and classifying mass spectral raw data concerning human normal and disease cases

  • Authors:
  • Xenofon E. Floros;George M. Spyrou;Konstantinos N. Vougas;George T. Tsangaris;Konstantina S. Nikita

  • Affiliations:
  • Electrical and Computer Engineering Faculty, National Technical University of Athens, Athens, Greece;Foundation for Biomedical Research of the Academy of Athens, Athens, Greece;Foundation for Biomedical Research of the Academy of Athens, Athens, Greece;Foundation for Biomedical Research of the Academy of Athens, Athens, Greece;Electrical and Computer Engineering Faculty, National Technical University of Athens, Athens, Greece

  • Venue:
  • ISBMDA'06 Proceedings of the 7th international conference on Biological and Medical Data Analysis
  • Year:
  • 2006

Abstract

Mass spectrometry is becoming an important tool in the biological sciences. Tissue samples or easily obtained biological fluids (serum, plasma, urine) are analysed by a variety of mass spectrometry methods, producing spectra characterized by very high dimensionality and a high level of noise. Here we present a feature extraction method for mass spectra that consists of two main steps. In the first step, an algorithm for low-level preprocessing of mass spectra is applied, including denoising with the Shift-Invariant Discrete Wavelet Transform (SIDWT), smoothing, baseline correction, peak detection and normalization of the resulting peak lists. After this step, we claim to have reduced the dimensionality and redundancy of the initial mass spectra representation while keeping all the meaningful features (potential biomarkers) required for disease-related proteomic patterns to be identified. In the second step, the peak lists are aligned and fed to a Support Vector Machine (SVM), which classifies the mass spectra. This procedure was applied to SELDI-QqTOF spectral data collected from normal and ovarian cancer serum samples. The classification performance was assessed for distinct values of the parameters involved in the feature extraction pipeline. The low-level preprocessing method described here results in 98.3% sensitivity, 98.3% specificity and an AUC (Area Under Curve) of 0.981 in spectra classification.
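
To make the first step more concrete, the following is a minimal sketch of a low-level preprocessing chain in plain Python. It is not the authors' implementation: a moving average stands in for the SIDWT denoising and smoothing, the baseline is estimated with a simple rolling minimum, and peaks are normalized by their summed intensity. All function names, window sizes, the height threshold and the synthetic spectrum are assumptions chosen for illustration. Peak-list alignment and the SVM classifier of the second step are left out.

```python
def smooth(intensities, window=5):
    """Moving-average smoothing (an assumed stand-in for SIDWT denoising)."""
    half = window // 2
    n = len(intensities)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(intensities[lo:hi]) / (hi - lo))
    return out


def subtract_baseline(intensities, window=21):
    """Estimate the baseline as a rolling minimum and subtract it."""
    half = window // 2
    n = len(intensities)
    corrected = []
    for i, y in enumerate(intensities):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        corrected.append(y - min(intensities[lo:hi]))
    return corrected


def detect_peaks(mz, intensities, min_height):
    """Return (m/z, intensity) pairs for local maxima at or above min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y >= min_height and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append((mz[i], y))
    return peaks


def normalize(peaks):
    """Scale peak intensities so that they sum to 1."""
    total = sum(y for _, y in peaks)
    if total == 0:
        return list(peaks)
    return [(m, y / total) for m, y in peaks]


def preprocess(mz, intensities, min_height=1.0):
    """Smoothing, baseline correction, peak detection, normalization."""
    smoothed = smooth(intensities)
    corrected = subtract_baseline(smoothed)
    return normalize(detect_peaks(mz, corrected, min_height))


# Synthetic spectrum (hypothetical data): constant baseline of 10 with
# two triangular peaks centred at m/z 1020 and m/z 1040.
mz = [1000 + i for i in range(60)]
intensities = [10.0] * 60
for center, shape in ((20, [10, 30, 50, 30, 10]), (40, [5, 15, 25, 15, 5])):
    for offset, extra in zip(range(-2, 3), shape):
        intensities[center + offset] += extra

peak_list = preprocess(mz, intensities)
```

The resulting `peak_list` replaces the 60-point spectrum with a short list of (m/z, relative intensity) pairs, which is the kind of reduced representation the abstract describes as input to alignment and classification.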