Matching a mass spectrum against a text, a key computational task in proteomics, is slow because existing text indexing algorithms (whose search time is independent of the text size) are not applicable in the domain of mass spectrometry. As a result, many important applications (e.g., searches for mutated peptides) are prohibitively time-consuming, and even the standard search for non-mutated peptides is becoming too slow given recent advances in high-throughput genomics and proteomics technologies. We introduce a new paradigm, the Blocked Pattern Matching (BPM) Problem, that models peptide identification. BPM corresponds to matching a pattern against a text (over the alphabet of integers) under the assumption that each symbol a in the pattern can match a block of consecutive symbols in the text whose sum is a. BPM opens a new, still unexplored direction in combinatorial pattern matching and leads to the Mutated BPM Problem (modeling identification of mutated peptides) and the Fused BPM Problem (modeling identification of fused peptides in tumor genomes). We illustrate how BPM algorithms solve problems that are beyond the reach of existing proteomics tools.
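
To make the BPM definition concrete, here is a minimal brute-force sketch in Python. It is not the algorithm developed in the paper, only a direct reading of the definition. It assumes that all text and pattern symbols are positive integers (as masses would be), so the block starting at a given position with a given sum is unique if it exists. The function names `blocked_match_at` and `blocked_pattern_matches` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def blocked_match_at(text: Sequence[int], pattern: Sequence[int], start: int) -> int:
    """Try to match `pattern` against `text` beginning at index `start`.

    Each pattern symbol a must equal the sum of a block of consecutive
    text symbols. Returns the index just past the match, or -1 on failure.
    Assumption: all symbols are positive integers, so greedy extension of
    each block is exact.
    """
    pos = start
    for a in pattern:
        total = 0
        # Extend the current block until its sum reaches (or overshoots) a.
        while pos < len(text) and total < a:
            total += text[pos]
            pos += 1
        if total != a:
            return -1  # overshot a, or ran out of text
    return pos


def blocked_pattern_matches(text: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return every start index at which `pattern` block-matches `text`."""
    return [i for i in range(len(text)) if blocked_match_at(text, pattern, i) != -1]


# Illustrative integer values; 57 + 71 = 128 and 97 + 99 = 196.
text = [57, 71, 87, 97, 99]
pattern = [128, 87, 196]
print(blocked_pattern_matches(text, pattern))
```

This sketch scans every start position and so takes time proportional to the text length times the matched span in the worst case. That linear dependence on the text size is the cost that indexing approaches aim to avoid.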