Charge state determination of peptide tandem mass spectra using support vector machine (SVM)

  • Authors:
  • An-Min Zou;Jinhong Shi;Jiarui Ding;Fang-Xiang Wu

  • Affiliations:
  • Department of Aerospace Engineering, Ryerson University, Toronto, ON, Canada and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada;Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada;Department of Computer Science, University of British Columbia, Vancouver, BC, Canada and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada;Department of Mechanical Engineering, and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada

  • Venue:
  • IEEE Transactions on Information Technology in Biomedicine - Special section on new and emerging technologies in bioinformatics and bioengineering
  • Year:
  • 2010

Abstract

A single mass spectrometry experiment can produce hundreds of thousands of tandem mass spectra. Several search engines have been developed to interpret tandem mass spectra. All search engines need to determine the masses of peptide ions from their mass/charge ratios. Unfortunately, mass spectrometers do not detect the charges of ions. A current strategy is to search candidate peptides multiple times, once for each possible charge state (typically +2 or +3). However, this strategy not only wastes search time but also increases the risk of false-positive peptide identifications. This paper aims to discriminate doubly charged spectra from triply charged ones. Twenty-eight features are introduced to describe the discriminant characteristics of doubly charged and triply charged spectra. A support vector machine (SVM) classifier is trained on these 28 features. To verify the proposed method, computational experiments are conducted on two types of datasets: the ISB dataset, generated by a low-resolution ion-trap instrument, and the TOV dataset, generated by a high-resolution quadrupole time-of-flight instrument. For each type of dataset, the SVM-based classifiers are trained and tested on 20 randomly sampled subdatasets. The results show that the proposed method reaches average correct rates of 95% and 93% in discriminating doubly charged spectra from triply charged ones for the low-resolution ISB dataset and the high-resolution TOV dataset, respectively.
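To illustrate why the charge state matters, the following minimal sketch computes the neutral peptide mass implied by one observed mass/charge ratio under each candidate charge. It uses the standard relation M = z * (m/z - m_proton) for protonated ions; the function names and the example m/z value are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: names and the example value are assumptions,
# not part of the paper's method.

PROTON_MASS = 1.007276  # approximate mass of a proton in daltons


def neutral_mass(mz: float, charge: int) -> float:
    """Return the neutral peptide mass implied by an observed m/z
    when the precursor ion is assumed to carry `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


def candidate_masses(mz: float, charges=(2, 3)) -> dict:
    """Map each candidate charge state to the peptide mass it implies."""
    return {z: neutral_mass(mz, z) for z in charges}


if __name__ == "__main__":
    # A hypothetical precursor observed at m/z 500.0
    for z, mass in candidate_masses(500.0).items():
        print(f"charge +{z}: {mass:.3f} Da")
```

For a precursor at m/z 500.0, the two assumptions imply masses of roughly 997.99 Da (+2) and 1496.98 Da (+3). A search engine that does not know the charge therefore has to search both mass windows, which is the duplicated work a charge state classifier is meant to avoid.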