A graph approach to the threshold all-against-all substring matching problem

Authors:
Marina Barsky; Ulrike Stege; Alex Thomo; Chris Upton
Affiliations:
University of Victoria, Victoria, BC, Canada (all authors)
Venue:
Journal of Experimental Algorithmics (JEA)
Year:
2008



Abstract

We present a novel graph model and an efficient algorithm for solving the “threshold all-against-all” problem: searching two strings, of lengths M and N respectively, for all maximal approximate substring matches of length at least S with up to K differences. Our algorithm solves the problem in O(MNK^3) time, a considerable improvement over the previously known bound for this problem. We also provide experimental evidence that, in practice, our algorithm performs better than its worst-case running time suggests.
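To make the problem statement concrete, the following is a minimal brute-force sketch of what is being searched for. It is not the graph algorithm of the paper and is far slower than O(MNK^3). It rests on several assumptions that the abstract does not state: "differences" are counted as unit-cost edit operations (Levenshtein distance), both substrings must have length at least S, and a match is "maximal" when no other reported match contains it in both strings. The function names are hypothetical.

```python
from typing import List, Tuple

Match = Tuple[int, int, int, int]  # (i, j, p, q) meaning x[i:j] matches y[p:q]


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (assumed meaning of 'differences')."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def threshold_matches(x: str, y: str, s: int, k: int) -> List[Match]:
    """All substring pairs with both lengths >= s and edit distance <= k."""
    hits = []
    for i in range(len(x)):
        for j in range(i + s, len(x) + 1):
            for p in range(len(y)):
                for q in range(p + s, len(y) + 1):
                    if edit_distance(x[i:j], y[p:q]) <= k:
                        hits.append((i, j, p, q))
    return hits


def maximal_only(hits: List[Match]) -> List[Match]:
    """Keep hits not contained in another hit in both strings (assumed maximality)."""
    def contains(big: Match, small: Match) -> bool:
        return (big != small
                and big[0] <= small[0] and big[1] >= small[1]
                and big[2] <= small[2] and big[3] >= small[3])
    return [h for h in hits if not any(contains(g, h) for g in hits)]


if __name__ == "__main__":
    # Two short strings that differ by a single substitution.
    print(maximal_only(threshold_matches("abcde", "abXde", s=3, k=1)))
```

The exhaustive enumeration is only practical for very short inputs; it is meant as a reference definition against which a faster method could be checked, under the assumptions listed above.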