Beating the noise: new statistical methods for detecting signals in MALDI-TOF spectra below noise level

Authors:
Tim O. F. Conrad;Alexander Leichtle;Andre Hagehülsmann;Elmar Diederichs;Sven Baumann;Joachim Thiery;Christof Schütte
Affiliations:
Department of Mathematics, Free University Berlin, Germany;Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, University Hospital Leipzig, Germany;Microsoft Research, Cambridge, UK;Department of Mathematics, Free University Berlin, Germany;Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, University Hospital Leipzig, Germany;Institute of Laboratory Medicine, Clinical Chemistry and Molecular Diagnostics, University Hospital Leipzig, Germany;Department of Mathematics, Free University Berlin, Germany
Venue:
CompLife'06 Proceedings of the Second international conference on Computational Life Sciences
Year:
2006

Abstract

Background: Computer-assisted detection of small molecules by mass spectrometry provides a snapshot of thousands of peptides, protein fragments and proteins in biological samples. This new analytical technology has the potential to identify disease-associated proteomic patterns in blood serum. However, the presently available bioinformatic tools are not sensitive enough to identify clinically important low-abundance proteins, such as hormones or tumor markers, that occur only at low concentrations in blood.

Aim: Find, analyze and compare serum proteome patterns in groups of human subjects that differ in properties such as disease status, using a new workflow that enhances sensitivity and specificity.

Problems: Mass data acquired from high-throughput platforms are frequently blurred and noisy. This complicates the reliable identification of peaks in general, and of very small peaks below noise level in particular. However, this limitation holds only when a single spectrum or a few spectra are analyzed. If the algorithm has access to a large number of spectra (e.g. N on the order of 1000), new possibilities arise, one of which is a statistical approach.

Approach: Apply signal preprocessing steps, followed by statistical analyses of the blurred data and of the region below the typical noise threshold, to identify signals usually hidden below this “barrier”.

Results: A new analysis workflow has been developed that is able to accurately identify and analyze peaks and determine their parameters even below noise level, where other tools cannot detect them. A comparison with commercial software has clearly demonstrated this gain in sensitivity. These additional peaks can be used in subsequent steps to build better peak patterns for proteomic pattern analysis. We believe that this new approach will foster the identification of new biomarkers that have not been detectable by most currently available algorithms.
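The claim that a large number of spectra opens up a statistical route can be illustrated with a small simulation. The sketch below is not the workflow described in the abstract. It is a minimal illustration of the general principle that averaging N spectra shrinks the noise by a factor of sqrt(N). All names and values (`simulate_spectra`, `mean_spectrum`, `z_scores`, the peak height, the noise level, the threshold of 5) are assumptions chosen for illustration. The noise is assumed to be Gaussian with a known standard deviation, and the spectra are assumed to be already aligned.

```python
import math
import random


def simulate_spectra(n_spectra, n_bins, peak_bin, peak_height, noise_sd, seed=0):
    """Return synthetic spectra: Gaussian noise plus one weak peak (assumed model)."""
    rng = random.Random(seed)
    spectra = []
    for _ in range(n_spectra):
        spectrum = [rng.gauss(0.0, noise_sd) for _ in range(n_bins)]
        spectrum[peak_bin] += peak_height
        spectra.append(spectrum)
    return spectra


def mean_spectrum(spectra):
    """Average the spectra bin by bin."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def z_scores(spectrum, noise_sd, n_averaged):
    """Express each bin in units of the standard error of an n-fold average."""
    standard_error = noise_sd / math.sqrt(n_averaged)
    return [value / standard_error for value in spectrum]


# Hypothetical setup: a peak at half the noise standard deviation,
# so it is below noise level in any single spectrum.
NOISE_SD = 1.0
PEAK_HEIGHT = 0.5
PEAK_BIN = 120
N_SPECTRA = 1000
THRESHOLD = 5.0  # illustrative detection threshold in standard errors

spectra = simulate_spectra(N_SPECTRA, 200, PEAK_BIN, PEAK_HEIGHT, NOISE_SD)

# One spectrum alone: the peak is buried in the noise.
single_z = z_scores(spectra[0], NOISE_SD, 1)
single_detected = [i for i, z in enumerate(single_z) if z > THRESHOLD]

# Average of all spectra: the noise shrinks by sqrt(N) and the peak stands out.
averaged_z = z_scores(mean_spectrum(spectra), NOISE_SD, N_SPECTRA)
detected = [i for i, z in enumerate(averaged_z) if z > THRESHOLD]

print("bins detected in a single spectrum:", single_detected)
print("bins detected in the averaged spectrum:", detected)
```

With these assumed values, the expected peak score in the averaged spectrum is about 0.5 × sqrt(1000) ≈ 15.8 standard errors, against 0.5 in a single spectrum. Real MALDI-TOF data would additionally need baseline correction, alignment and a noise estimate, which this sketch leaves out.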