Peptide mass fingerprinting is a technique for identifying a protein from its fragment masses, obtained by mass spectrometry after enzymatic digestion. Recently, much attention has been given to the question of how to evaluate the significance of identifications; results have been developed mostly from a combinatorial perspective. In particular, existing methods generally do not capture the fact that the same amino acid can have different masses because of, e.g., isotopic distributions or variable chemical modifications. We offer several new contributions to the field. We introduce probabilistically weighted alphabets, where each character can have different masses according to a probability distribution, and random weighted strings as a fundamental model for random proteins. We develop a general computational framework, Markov Additive Chains, for various statistics of cleavage fragments of random proteins, and obtain general formulas for these statistics. Special results are given for so-called standard cleavage schemes (e.g., trypsin). Computational results are provided, as well as a comparison to proteins from the SwissProt database.
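To make the notions of a probabilistically weighted alphabet and a random weighted string more concrete, here is a minimal simulation sketch in Python. It is not the Markov Additive Chain framework itself, which yields exact formulas; it only samples one random protein and cleaves it. Everything in it is an illustrative assumption: the five-letter alphabet, the rounded integer masses and their probabilities, the i.i.d. character frequencies, and the simplified trypsin-like rule (cleave after every K or R). The names `WEIGHTED_ALPHABET`, `CHAR_PROBS`, `random_weighted_string`, `cleave`, and `fragment_masses` are hypothetical.

```python
import random

# Hypothetical toy alphabet: each character maps to (mass, probability) pairs.
# Masses are rounded, illustrative values, not a real mass table.
WEIGHTED_ALPHABET = {
    "A": [(71, 1.0)],
    "G": [(57, 1.0)],
    "K": [(128, 0.9), (170, 0.1)],  # second mass mimics a variable modification
    "R": [(156, 1.0)],
    "M": [(131, 0.8), (147, 0.2)],  # second mass mimics a variable modification
}

# Assumed i.i.d. character frequencies for the random protein model.
CHAR_PROBS = {"A": 0.3, "G": 0.3, "K": 0.15, "R": 0.15, "M": 0.1}


def random_weighted_string(length, rng):
    """Draw characters i.i.d., then draw one mass per character."""
    chars = rng.choices(list(CHAR_PROBS), weights=list(CHAR_PROBS.values()), k=length)
    result = []
    for char in chars:
        masses, probs = zip(*WEIGHTED_ALPHABET[char])
        mass = rng.choices(masses, weights=probs, k=1)[0]
        result.append((char, mass))
    return result


def cleave(weighted_string, cleavage_chars=("K", "R")):
    """Split after every cleavage character (simplified trypsin-like rule)."""
    fragments, current = [], []
    for char, mass in weighted_string:
        current.append((char, mass))
        if char in cleavage_chars:
            fragments.append(current)
            current = []
    if current:  # trailing fragment without a cleavage character
        fragments.append(current)
    return fragments


def fragment_masses(fragments):
    """Mass of a fragment is the sum of the masses of its characters."""
    return [sum(mass for _, mass in fragment) for fragment in fragments]


if __name__ == "__main__":
    rng = random.Random(0)
    protein = random_weighted_string(30, rng)
    print(fragment_masses(cleave(protein)))
```

Repeating the sampling many times would give empirical estimates of fragment statistics such as the distribution of fragment masses, which is the kind of quantity the abstract says the framework provides through general formulas.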