Bit-parallelism permits executing several operations simultaneously over a set of bits or numbers stored in a single computer word. This technique permits searching for the approximate occurrences of a pattern of length m in a text of length n in time O(⌈m/w⌉n), where w is the number of bits in the computer word. Although this is asymptotically the optimal bit-parallel speedup over the basic O(mn) time algorithm, it wastes bit-parallelism's power in the common case where m is much smaller than w, since w−m bits of each computer word are unused. In this paper, we explore different ways to increase the bit-parallelism when the search pattern is short. First, we show how multiple patterns can be packed into a single computer word so as to search for all of them simultaneously. Instead of spending O(rn) time to search for r patterns of length m≤w/2, we need O(⌈rm/w⌉n) time. Second, we show how the same mechanism permits boosting the search for a single pattern of length m≤w/2, which can be searched for in O(⌈n/⌊w/m⌋⌉) bit-parallel steps instead of O(n). Third, we show how to extend these algorithms so that the time bounds essentially depend on k instead of m, where k is the maximum number of differences permitted. Finally, we show how the ideas can be applied to other problems such as multiple exact string matching and one-against-all computation of edit distance and longest common subsequences. Our experimental results show that the new algorithms work well in practice, obtaining significant speedups over the best existing alternatives, especially on short patterns and a moderate number of differences allowed. This work fills an important gap in the field, where little work has focused on very short patterns.
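To make the packing idea concrete, the following is a minimal sketch of the simplest case mentioned above, multiple exact string matching, using the classic Shift-And automaton. It is not the paper's algorithm for approximate matching; it only illustrates how r patterns of equal length m can share one word when rm ≤ w. The function name `packed_shift_and`, the parameter `w`, and the use of a Python integer to stand in for a w-bit machine word are assumptions made for illustration.

```python
def packed_shift_and(patterns, text, w=64):
    """Return (end_position, pattern_index) for every exact occurrence.

    Hypothetical helper: all patterns must have the same length m, and
    r * m must not exceed the simulated word size w.
    """
    r = len(patterns)
    m = len(patterns[0]) if patterns else 0
    if m == 0 or any(len(p) != m for p in patterns):
        raise ValueError("patterns must be non-empty and of equal length")
    if r * m > w:
        raise ValueError("patterns do not fit in one word")

    init = 0    # lowest bit of every m-bit field
    final = 0   # highest bit of every m-bit field
    table = {}  # table[c] has bit j*m+i set iff patterns[j][i] == c
    for j, p in enumerate(patterns):
        init |= 1 << (j * m)
        final |= 1 << (j * m + m - 1)
        for i, c in enumerate(p):
            table[c] = table.get(c, 0) | (1 << (j * m + i))

    hits = []
    state = 0
    for pos, c in enumerate(text):
        # One shift, one OR and one AND advance all r automata at once.
        # A bit shifted out of field j lands on the lowest bit of field
        # j+1, which init sets anyway, so fields cannot corrupt each other.
        state = ((state << 1) | init) & table.get(c, 0)
        found = state & final
        for j in range(r):
            if (found >> (j * m + m - 1)) & 1:
                hits.append((pos, j))
    return hits


print(packed_shift_and(["abc", "bca", "cab"], "abcabc"))
# prints [(2, 0), (3, 1), (4, 2), (5, 0)]
```

Each text character costs a constant number of word operations regardless of r, as long as all fields fit in one word; with w = 64 and m = 3, up to 21 patterns can be searched together in this sketch.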