SPIDER: Software for Protein Identification from Sequence Tags with De Novo Sequencing Error

Authors:
Yonghua Han;Bin Ma;Kaizhong Zhang
Affiliations:
University of Western Ontario;University of Western Ontario;University of Western Ontario
Venue:
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Year:
2004

Abstract

For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then matched against the sequences in a protein database, allowing for amino acid mutations. If de novo sequencing gives correct tags, homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new, efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software.
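
The kind of de novo sequencing error described above can be illustrated with a small sketch. It is not SPIDER's algorithm: it only checks whether the part where a tag and a database peptide differ could be explained by a single segment replaced with another of approximately the same mass. The function names and the 0.05 Da tolerance are assumptions made for illustration; the table holds standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def segment_mass(segment):
    """Sum of the residue masses of an amino acid segment."""
    return sum(RESIDUE_MASS[aa] for aa in segment)


def same_mass(seg_a, seg_b, tol=0.05):
    """True if the two segments differ in mass by at most tol (assumed value)."""
    return abs(segment_mass(seg_a) - segment_mass(seg_b)) <= tol


def differing_segments(tag, peptide):
    """Strip the common prefix and suffix; return the two middle parts."""
    shortest = min(len(tag), len(peptide))
    p = 0
    while p < shortest and tag[p] == peptide[p]:
        p += 1
    s = 0
    while s < shortest - p and tag[-1 - s] == peptide[-1 - s]:
        s += 1
    return tag[p:len(tag) - s], peptide[p:len(peptide) - s]


def explained_by_mass_swap(tag, peptide, tol=0.05):
    """True if tag and peptide differ only by one equal-mass segment swap."""
    seg_tag, seg_pep = differing_segments(tag, peptide)
    return same_mass(seg_tag, seg_pep, tol)


if __name__ == "__main__":
    # "GG" and "N" have nearly identical masses, so a de novo tag
    # "LGGSK" may really correspond to the database peptide "LNSK".
    print(differing_segments("LGGSK", "LNSK"))
    print(explained_by_mass_swap("LGGSK", "LNSK"))
```

A real matcher would also have to combine such mass-preserving replacements with genuine amino acid mutations between homologous proteins, which is the harder problem the paper addresses.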