Mining whole-sample mass spectrometry proteomics data for biomarkers - An overview

  • Authors:
  • Ross A. McDonald; Paul Skipp; Julia Bennell; Chris Potts; Lyn Thomas; C. David O'Connor

  • Affiliations:
  • Centre for Proteomic Research and School of Biological Sciences, University of Southampton, Biomedical Sciences Building, Bassett Crescent East, Southampton SO16 7PX, UK and Centre for Operational ...;Centre for Proteomic Research and School of Biological Sciences, University of Southampton, Biomedical Sciences Building, Bassett Crescent East, Southampton SO16 7PX, UK;Centre for Operational Research, Management Science and Information Systems, University of Southampton, UK;Centre for Operational Research, Management Science and Information Systems, University of Southampton, UK;Centre for Operational Research, Management Science and Information Systems, University of Southampton, UK;Centre for Proteomic Research and School of Biological Sciences, University of Southampton, Biomedical Sciences Building, Bassett Crescent East, Southampton SO16 7PX, UK

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2009

Abstract

Biomarkers are proteins or other components of a clinical sample whose measured intensity alters in response to a biological change such as an infection or disease, and which may therefore be useful for prediction and diagnosis. Proteomics is the science of discovering, identifying and understanding such components using tools such as mass spectrometry. In this paper we aim to provide a concise overview of designing and conducting an MS proteomics study in such a way as to allow statistical analysis that may lead to the discovery of novel markers. We provide a summary of the various stages that make up such an experiment, highlighting the need for experimental goals to be decided upon in advance. We discuss issues in experimental design at the sample collection stage, and good practice for standardising protocols within the proteomics laboratory. We then describe approaches to the data mining stage of the experiment, including the processing steps that transform a raw mass spectrum into a useable form. We propose a permutation-based procedure for determining the significance of reported error rates. Finally, because of its advantage in speed and low cost, we suggest that MS proteomics may be a good candidate for an early primary screening approach to disease diagnosis, identifying areas of risk and making referrals for more specific tests without necessarily making a diagnosis in its own right. Our discussion is illustrated with examples drawn from experiments on bovine blood serum designed to pinpoint novel biomarkers for bovine tuberculosis.
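The abstract does not spell out the permutation-based procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a generic label-permutation test: the class labels are shuffled many times, the classifier's error rate is recomputed on each shuffle, and the reported error rate is judged against that null distribution. The nearest-centroid classifier, the leave-one-out error estimate, the function names (loo_error, permutation_p_value) and the toy intensities are all illustrative assumptions.

```python
import random


def loo_error(X, y):
    """Leave-one-out error rate of a nearest-centroid classifier (stand-in model)."""
    errors = 0
    for i in range(len(X)):
        # Accumulate per-class feature sums from every sample except sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(row))
            for k, value in enumerate(row):
                acc[k] += value
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label):
            # Squared Euclidean distance from sample i to the class centroid.
            return sum(
                (X[i][k] - sums[label][k] / counts[label]) ** 2
                for k in range(len(X[i]))
            )

        predicted = min(sums, key=sq_dist)
        errors += predicted != y[i]
    return errors / len(X)


def permutation_p_value(X, y, n_perm=200, seed=0):
    """Return (observed error, p-value) under random relabelling of the samples."""
    rng = random.Random(seed)
    observed = loo_error(X, y)
    labels = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any real link between spectra and class
        if loo_error(X, labels) <= observed:
            hits += 1
    # The +1 terms count the observed labelling itself, so p is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): one peak intensity per sample, two classes.
X = [[0.1 * i] for i in range(10)] + [[5.0 + 0.1 * i] for i in range(10)]
y = ["control"] * 10 + ["infected"] * 10
observed_error, p_value = permutation_p_value(X, y)
print(observed_error, p_value)
```

A small p-value here indicates that an error rate as low as the observed one rarely arises when the labels carry no information, which is the sense in which a reported error rate can be called significant. In a real study the whole pipeline, including any peak selection, would need to be rerun inside each permutation to avoid an optimistic bias.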