The edit distance between strings A and B is defined as the minimum number of edit operations needed to convert A into B or vice versa. The Levenshtein edit distance allows three types of operations: insertion, deletion, or substitution of a character. The Damerau edit distance allows these three and, in addition, a transposition of two adjacent characters. To the best of our knowledge, the best current practical algorithms for computing these edit distances run in time O(dm) and O(⌈m/w⌉(n + σ)), where d is the edit distance between the two strings, m and n are their lengths (m ≤ n), w is the computer word size, and σ is the size of the alphabet. In this paper we present an algorithm that runs in time O(⌈d/w⌉m + ⌈n/w⌉σ) or O(⌈d/w⌉n + ⌈m/w⌉σ). The structure of the algorithm is such that, in practice, it is best suited to testing whether the edit distance between two strings is within some predetermined error threshold. We also present initial test results for thresholded edit distance computation, in which our algorithm is faster than the original algorithm of Myers.
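To make the two distance definitions and the threshold test concrete, the following is a minimal sketch using the classical O(mn) dynamic-programming recurrence. It is not the bit-parallel algorithm presented in the paper, and the name `edit_distance` and its parameters are hypothetical, chosen only for illustration. The sketch also assumes the restricted form of the Damerau distance, in which a transposed pair of characters is not edited further.

```python
def edit_distance(a, b, transpositions=False, threshold=None):
    """Plain dynamic-programming edit distance (illustrative sketch only).

    transpositions=False gives the Levenshtein distance.
    transpositions=True also allows swapping two adjacent characters
    (assumed here to be the restricted Damerau variant).
    If threshold is given, None is returned when the distance exceeds it.
    """
    prev2 = None                       # row i-2, needed for transpositions
    prev = list(range(len(b) + 1))     # row 0: distance from "" to b[:j]
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)       # column 0: distance from a[:i] to ""
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(prev[j] + 1,         # deletion
                       cur[j - 1] + 1,      # insertion
                       prev[j - 1] + cost)  # match or substitution
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                best = min(best, prev2[j - 2] + 1)  # transposition
            cur[j] = best
        # Early exit: if every cell of this row exceeds the threshold,
        # no cell in a later row can be within it.
        if threshold is not None and min(cur) > threshold:
            return None
        prev2, prev = prev, cur
    d = prev[len(b)]
    if threshold is not None and d > threshold:
        return None
    return d
```

For example, `edit_distance("ca", "ac")` counts two substitutions, whereas `edit_distance("ca", "ac", transpositions=True)` counts a single transposition. Passing `threshold=k` turns the computation into the kind of "is the distance at most k" test that the abstract describes as the main practical use case.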