A database search algorithm for identification of peptides with multiple charges using tandem mass spectrometry

  • Authors:
  • Kang Ning;Ket Fah Chong;Hon Wai Leong

  • Affiliations:
  • Department of Computer Science, National University of Singapore, Singapore;Department of Computer Science, National University of Singapore, Singapore;Department of Computer Science, National University of Singapore, Singapore

  • Venue:
  • BioDM'06 Proceedings of the 2006 international conference on Data Mining for Biomedical Applications
  • Year:
  • 2006

Abstract

Peptide sequencing using tandem mass spectrometry is the process of determining the peptide sequence from a given mass spectrum. It is an important but challenging problem in bioinformatics. Advances in mass spectrometry instruments have yielded a large amount of high-quality spectral data, but the methods used to analyze these spectra and obtain peptide sequences are still not accurate enough. There are two types of peptide sequencing methods: database search methods and de novo methods. Much progress has been made, but the accuracy and efficiency of these methods are not yet satisfactory, and improvements are urgently needed. In this paper, we introduce a database search algorithm for sequencing peptides using tandem mass spectrometry. This Peptide Sequence Pattern (PSP) algorithm first generates peptide sequence patterns (PSPs) by connecting strong tags with mass differences. A linear-time database search is then used to find candidate peptide sequences that match the PSPs, and the candidates are scored by shared peaks count. The PSP algorithm is designed for peptide sequencing from multiply charged spectra, but it is also applicable to singly charged spectra. Experiments show that our algorithm obtains better sequencing results than current database search algorithms on many multiply charged spectra, and comparable results on singly charged spectra.
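The scoring step mentioned above, shared peaks count, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 0.5 Da tolerance, the restriction to b- and y-ions, and the small residue-mass table are all assumptions made for illustration. It counts how many theoretical fragment ions of a candidate peptide, taken at charges 1 up to a chosen maximum, have an observed peak within the tolerance.

```python
from bisect import bisect_left

# Monoisotopic residue masses in Da for a few amino acids (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
PROTON = 1.00728   # mass of a proton in Da
WATER = 18.01056   # mass of H2O in Da


def theoretical_peaks(peptide, max_charge):
    """Return m/z values of b- and y-ions for charges 1..max_charge."""
    masses = [RESIDUE_MASS[a] for a in peptide]
    total = sum(masses)
    peaks = []
    prefix = 0.0
    # Each cut between adjacent residues yields one b-ion and one y-ion.
    for m in masses[:-1]:
        prefix += m
        b_neutral = prefix
        y_neutral = total - prefix + WATER
        for z in range(1, max_charge + 1):
            peaks.append((b_neutral + z * PROTON) / z)
            peaks.append((y_neutral + z * PROTON) / z)
    return peaks


def shared_peaks_count(observed_mz, peptide, max_charge, tol=0.5):
    """Count theoretical peaks that have an observed peak within +/- tol."""
    observed = sorted(observed_mz)
    count = 0
    for mz in theoretical_peaks(peptide, max_charge):
        # First observed peak that is >= mz - tol, if any.
        i = bisect_left(observed, mz - tol)
        if i < len(observed) and observed[i] <= mz + tol:
            count += 1
    return count


# Hypothetical usage: rank candidate peptides against a noise-free
# doubly charged spectrum simulated from the peptide "GAK".
spectrum = theoretical_peaks("GAK", 2)
candidates = ["VLE", "GAK"]
best = max(candidates, key=lambda p: shared_peaks_count(spectrum, p, 2))
```

Considering fragment ions at every charge up to the precursor charge is what makes such a score usable for multiply charged spectra; with `max_charge=1` it reduces to the singly charged case.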