Bit-parallel string matching under Hamming distance in O(n⌈m/w⌉) worst case time

  • Authors:
  • Szymon Grabowski; Kimmo Fredriksson

  • Affiliations:
  • Technical University of Łódź, Computer Engineering Department, Al. Politechniki 11, 90-924 Łódź, Poland; Department of Computer Science, University of Kuopio, P.O. Box 1627, 70211 Kuopio, Finland

  • Venue:
  • Information Processing Letters
  • Year:
  • 2008

Abstract

Given two strings, a pattern P of length m and a text T of length n over some alphabet Σ, we consider the string matching problem under k mismatches. The well-known Shift-Add algorithm [R.A. Baeza-Yates, G.H. Gonnet, A new approach to text searching, Comm. ACM 35 (10) (1992) 74-82] solves the problem in O(n⌈m log(k)/w⌉) worst case time, where w is the number of bits in a computer word. We present two algorithms that improve this result to O(n⌈m log log(k)/w⌉) and O(n⌈m/w⌉), respectively. The algorithms make use of nested varying length bit-strings that represent the search state. We call these Matryoshka counters. The techniques we developed are of more general use for string matching problems.
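
For orientation, the following is a minimal sketch of the Shift-Add baseline that the abstract refers to. It is not the authors' Matryoshka-counter method. The function name `shift_add_search` and all variable names are illustrative assumptions. Each pattern position gets a mismatch counter of b = ⌈log2(k+1)⌉ + 1 bits, and a Python integer stands in for the ⌈mb/w⌉ machine words a real implementation would use.

```python
def shift_add_search(pattern, text, k):
    """Return start positions where pattern occurs in text with <= k mismatches.

    Sketch of the classic Shift-Add idea: one small counter per pattern
    position, all packed into a single integer (assumed layout, for
    illustration only).
    """
    m = len(pattern)
    if m == 0 or k < 0:
        raise ValueError("pattern must be non-empty and k non-negative")

    b = k.bit_length() + 1               # bits for values 0..k plus one overflow bit
    field_mask = (1 << b) - 1            # selects one counter field
    all_mask = (1 << (m * b)) - 1        # selects all m fields
    low_ones = sum(1 << (i * b) for i in range(m))           # a 1 in every field
    high = sum(1 << (i * b + b - 1) for i in range(m))       # overflow bit of every field

    # mismatch[c] has a 1 in field i exactly when pattern[i] != c.
    mismatch = {}
    for c in set(pattern):
        same = sum(1 << (i * b) for i in range(m) if pattern[i] == c)
        mismatch[c] = low_ones & ~same

    state = 0      # field i: mismatches of pattern[0..i] against the last i+1 text chars
    overflow = 0   # remembers which fields have ever exceeded their capacity
    top_shift = (m - 1) * b
    matches = []

    for j, c in enumerate(text):
        # Shift every counter to the next pattern position, then add mismatches.
        state = ((state << b) + mismatch.get(c, low_ones)) & all_mask
        # Move overflow bits aside so the counters never carry into a neighbour.
        overflow = ((overflow << b) | (state & high)) & all_mask
        state &= ~high
        if j >= m - 1:
            count = (state >> top_shift) & field_mask
            overflowed = (overflow >> top_shift) & field_mask
            if overflowed == 0 and count <= k:
                matches.append(j - m + 1)
    return matches
```

As a usage example, `shift_add_search("abc", "abcabdxbc", 1)` reports the windows "abc", "abd" and "xbc". The per-character work is proportional to the number of words needed to hold m counters of about log(k) bits each, which is where the O(n⌈m log(k)/w⌉) bound quoted above comes from.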