A parallel algorithm for fixed-length approximate string-matching with k-mismatches

Authors:
Maxime Crochemore; Costas S. Iliopoulos; Solon P. Pissis
Affiliations:
Dept. of Computer Science, King’s College London, London, UK (all three authors)
Venue:
Algorithms and Applications
Year:
2010

Abstract

This paper deals with the approximate string-matching problem under the Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. Approximate string-matching algorithms have both pleasing theoretical features and direct applications, especially in computational biology. We consider a generalisation of this problem, the fixed-length approximate string-matching with k-mismatches problem: given a text t, a pattern x and an integer ℓ, find all occurrences in t of every factor of x of length ℓ, where an occurrence is a factor of t that differs from it in k or fewer positions. We present a practical parallel algorithm of comparable simplicity whose running time depends on w, the word size of the machine (e.g. 32 or 64 in practice), and p, the number of processors, but not on k or the alphabet size |Σ|. The proposed parallel algorithm combines the message-passing parallelism model with word-level parallelism for efficient approximate string-matching.
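To make the two problem statements concrete, the following is a minimal brute-force sketch in Python. It is not the parallel algorithm described in the paper: it uses neither message passing nor word-level parallelism, and its running time grows with the product of all input lengths. The function names `hamming`, `k_mismatches` and `fixed_length_k_mismatches` are hypothetical, chosen only for this illustration, and positions are assumed to be 0-based.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(ca != cb for ca, cb in zip(a, b))


def k_mismatches(t: str, x: str, k: int) -> List[int]:
    """Start positions i such that t[i:i+m] matches x with at most k mismatches."""
    m = len(x)
    return [i for i in range(len(t) - m + 1) if hamming(t[i:i + m], x) <= k]


def fixed_length_k_mismatches(t: str, x: str, ell: int, k: int) -> List[Tuple[int, int]]:
    """Pairs (j, i): the factor of x of length ell starting at j matches the
    factor of t of length ell starting at i with at most k mismatches."""
    return [
        (j, i)
        for j in range(len(x) - ell + 1)       # every length-ell factor of x
        for i in k_mismatches(t, x[j:j + ell], k)  # searched in t on its own
    ]
```

For instance, `k_mismatches("abcabd", "abd", 1)` returns `[0, 3]`, since `abc` differs from `abd` in one position and `abd` occurs exactly at position 3. The fixed-length variant simply repeats that search for each length-ℓ factor of the pattern, which is the redundancy a dedicated algorithm would aim to avoid.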