Fast and exact comparison of large genomic sequences remains a challenging task in biosequence analysis. We consider the problem of finding all ε-matches between two sequences, i.e., all local alignments of at least a given length with an error rate of at most ε. We study this problem theoretically and give an efficient q-gram filter for solving it. We also discuss two applications of the filter: genomic sequence assembly and BLAST-like sequence comparison. Our results show that the method is 25 times faster than BLAST while remaining exact rather than heuristic.
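To make the idea of a q-gram filter concrete, the following minimal sketch illustrates the classic q-gram counting lemma, not the filter developed in this paper. It assumes whole-string comparison under edit distance: two strings within edit distance e must share at least max(|a|, |b|) + 1 - q(e + 1) q-grams, so any pair below that count can be discarded without computing an alignment. The function names (`qgrams`, `shared_qgrams`, `qgram_threshold`, `may_match`) and the example strings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def qgrams(s, q):
    """Return all overlapping substrings of length q in s."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]


def shared_qgrams(a, b, q):
    """Count q-grams common to a and b, respecting multiplicity."""
    ca, cb = Counter(qgrams(a, q)), Counter(qgrams(b, q))
    return sum((ca & cb).values())  # multiset intersection


def qgram_threshold(n, q, e):
    """Lower bound on shared q-grams for strings of length n with at most e edits."""
    return n + 1 - q * (e + 1)


def may_match(a, b, q, e):
    """Lossless filter: False means a and b cannot be within e edits."""
    n = max(len(a), len(b))
    return shared_qgrams(a, b, q) >= qgram_threshold(n, q, e)


if __name__ == "__main__":
    # One substitution separates these two strings (illustrative data).
    print(may_match("ACGTACGTAC", "ACGTTCGTAC", q=3, e=1))
    # Completely dissimilar strings are rejected by the filter.
    print(may_match("AAAAAAAAAA", "CCCCCCCCCC", q=3, e=1))
```

A filter of this kind only rules candidates out; pairs that pass still need verification by an exact alignment, which is what keeps the overall method non-heuristic. Extending the idea to local ε-matches of a minimum length, as the abstract describes, requires counting q-gram hits within restricted regions rather than over whole strings.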