Bit-parallel string matching under Hamming distance in O(n⌈m/w⌉) worst case time

Authors:
Szymon Grabowski; Kimmo Fredriksson
Affiliations:
Technical University of Łódź, Computer Engineering Department, Al. Politechniki 11, 90-924 Łódź, Poland; Department of Computer Science, University of Kuopio, P.O. Box 1627, 70211 Kuopio, Finland
Venue:
Information Processing Letters
Year:
2008

References (6):
Improved string matching with k mismatches. ACM SIGACT News
Efficient text searching
A new approach to text searching. Communications of the ACM
A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM (JACM)
Faster algorithms for string matching with k mismatches. SODA '00 Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms
A guided tour to approximate string matching. ACM Computing Surveys (CSUR)

Cited by (7):
Nested Counters in Bit-Parallel String Matching. LATA '09 Proceedings of the 3rd International Conference on Language and Automata Theory and Applications
Average-optimal string matching. Journal of Discrete Algorithms
Fast bit-parallel matching for network and regular expressions. SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
Pattern matching in the Hamming distance with thresholds. Information Processing Letters
From nondeterministic suffix automaton to lazy suffix tree. Algorithms and Applications
Exploiting word-level parallelism for fast convolutions and their applications in approximate string matching. European Journal of Combinatorics
Approximate pattern matching with k-mismatches in packed text. Information Processing Letters

Abstract

Given two strings, a pattern P of length m and a text T of length n over some alphabet Σ, we consider the string matching problem under k mismatches. The well-known Shift-Add algorithm [R.A. Baeza-Yates, G.H. Gonnet, A new approach to text searching, Comm. ACM 35 (10) (1992) 74-82] solves the problem in O(n⌈m log(k)/w⌉) worst case time, where w is the number of bits in a computer word. We present two algorithms that improve this result to O(n⌈m log log(k)/w⌉) and O(n⌈m/w⌉), respectively. The algorithms make use of nested varying length bit-strings, that represent the search state. We call these Matryoshka counters. The techniques we developed are of more general use for string matching problems.
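
To make the baseline concrete, the following is a minimal Python sketch of the classical Shift-Add idea that the abstract refers to. It is not the Matryoshka-counter technique of this paper. It keeps one small mismatch counter per pattern position, packed into a single integer, and updates all of them with one shift and one addition per text character. The function name `shift_add_search`, the counter layout, and the separate overflow word are illustrative assumptions, and a Python arbitrary-precision integer stands in for the ⌈m log(k)/w⌉ machine words of a real implementation.

```python
def shift_add_search(pattern: str, text: str, k: int) -> list[int]:
    """Return start positions where pattern matches text with at most k mismatches.

    Illustrative sketch of the Shift-Add scheme (hypothetical helper, not the
    paper's algorithm). Counter i occupies bits [i*b, (i+1)*b) of the state.
    """
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")

    # b - 1 value bits hold counts 0..k; the top bit of each counter flags overflow.
    b = k.bit_length() + 1
    low = (1 << (b - 1)) - 1                       # value bits of one counter
    full = (1 << (m * b)) - 1                      # all m counters
    high_all = sum(1 << (i * b + b - 1) for i in range(m))  # every overflow bit
    all_ones = sum(1 << (i * b) for i in range(m))           # +1 in every counter

    # masks[c] adds 1 to counter i exactly when pattern[i] != c.
    masks = {
        c: sum(1 << (i * b) for i in range(m) if pattern[i] != c)
        for c in set(pattern)
    }

    top_shift = (m - 1) * b
    state = 0      # packed mismatch counters
    overflow = 0   # sticky overflow flags, shifted along with the counters
    matches = []
    for j, c in enumerate(text):
        # Counter i-1 moves into slot i, then each slot gains its mismatch bit.
        state = ((state << b) & full) + masks.get(c, all_ones)
        # Remember overflows, then clear them so the next addition cannot carry
        # into a neighbouring counter.
        overflow = ((overflow << b) & full) | (state & high_all)
        state &= ~high_all
        if j >= m - 1:
            overflowed = (overflow >> (top_shift + b - 1)) & 1
            count = (state >> top_shift) & low
            if not overflowed and count <= k:
                matches.append(j - m + 1)
    return matches


print(shift_add_search("abc", "abcabd xbc", 1))  # prints [0, 3, 7]
```

Each counter here needs about log(k) bits, which is where the log(k) factor in the Shift-Add bound comes from; the algorithms announced in the abstract reduce or remove that factor.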