Tandem mass spectrometry (MS/MS) is the most important method for peptide and protein identification. One approach to interpreting MS/MS data is de novo sequencing, which is becoming increasingly accurate and important. However, de novo sequencing can usually determine only partial sequences with confidence, and the undetermined parts are represented by “mass gaps”. We call such a partially determined sequence a gapped sequence tag. When a gapped sequence tag is searched against a database for protein identification, the determined parts should match the database sequence exactly, while each mass gap should match a substring of amino acids whose masses sum to the value of the mass gap. In this setting, standard string matching algorithms no longer apply. In this paper, we present a new, efficient algorithm for finding the matches of gapped sequence tags in a protein database.
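To make the matching rule concrete, the following is a minimal brute-force sketch of what it means for a gapped sequence tag to match a database sequence. It is not the efficient algorithm the paper presents; it only illustrates the problem definition. The function name `match_tag`, the representation of a tag as a list of strings (determined parts) and floats (mass gaps), and the tolerance `tol` are all assumptions made for illustration. The residue masses are standard monoisotopic values, and the sketch assumes the sequence contains only the 20 standard amino acids.

```python
# Monoisotopic residue masses in daltons (standard values, rounded to 5 places).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def match_tag(tag, protein, tol=0.02):
    """Return sorted (start, end) spans of `protein` matched by `tag`.

    `tag` is a hypothetical representation: a list whose items are either
    strings (determined parts, matched exactly) or floats (mass gaps,
    matched by a non-empty substring whose residue masses sum to the gap
    within `tol` daltons). Brute force, for illustration only.
    """

    def extend(i, pos):
        # Yield every end position reachable by matching tag[i:] at pos.
        if i == len(tag):
            yield pos
            return
        part = tag[i]
        if isinstance(part, str):
            # Determined part: must match the sequence exactly.
            if protein.startswith(part, pos):
                yield from extend(i + 1, pos + len(part))
        else:
            # Mass gap: grow a substring until its mass exceeds the gap.
            total = 0.0
            end = pos
            while end < len(protein) and total < part + tol:
                total += RESIDUE_MASS[protein[end]]
                end += 1
                if abs(total - part) <= tol:
                    yield from extend(i + 1, end)

    spans = set()
    for start in range(len(protein)):
        for end in extend(0, start):
            spans.add((start, end))
    return sorted(spans)


if __name__ == "__main__":
    # Gap of 248.11609 Da equals the combined mass of T and F.
    print(match_tag(["WV", 248.11609, "IS"], "MKWVTFISLLLLFSSAYS"))
    # Gap of 128.05857 Da fits both "GA" and the single residue "Q".
    print(match_tag([128.05857, "K"], "GAKQK"))
```

The second call shows why exact string matching is not enough: one mass gap can correspond to substrings of different lengths and compositions, so a single tag may match several unrelated stretches of a sequence. This sketch rescans the sequence from every start position, which is the cost an efficient algorithm would aim to avoid on a full protein database.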