Faster algorithms for string matching with k mismatches

  • Authors: Amihood Amir; Moshe Lewenstein; Ely Porat

  • Affiliations: Department of Mathematics and Computer Science, Bar-Ilan University, 52900 Ramat-Gan, Israel and College of Computing, Georgia Institute of Technology, Atlanta, GA; Department of Mathematics and Computer Science, Bar-Ilan University, 52900 Ramat-Gan, Israel; Department of Mathematics and Computer Science, Bar-Ilan University, 52900 Ramat-Gan, Israel

  • Venue: Journal of Algorithms - Special issue: SODA 2000
  • Year: 2004

Abstract

The string matching with mismatches problem is that of finding the number of mismatches between a pattern P of length m and every length-m substring of a text T of length n. Currently, the fastest algorithms for this problem are the following. The Galil-Giancarlo algorithm finds all locations where the pattern has at most k mismatches (where k is part of the input) in time O(nk). The Abrahamson algorithm finds the number of mismatches at every location in time O(n√(m log m)). We present an algorithm that is faster than both. Our algorithm finds all locations where the pattern has at most k mismatches in time O(n√(k log k)). We also show an algorithm that solves the same problem in time O((n + (nk^3)/m) log k).
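
To make the problem statement concrete, the following is a minimal sketch of the naive O(nm) baseline: it counts mismatches at every alignment and then filters the locations with at most k mismatches. It is not the algorithm of the paper, nor the Galil-Giancarlo or Abrahamson algorithms; the function names `mismatch_counts` and `k_mismatch_locations` are hypothetical and chosen only for illustration.

```python
from typing import List


def mismatch_counts(text: str, pattern: str) -> List[int]:
    """Return the number of mismatches between pattern and every
    length-m substring of text (naive O(nm) baseline, for illustration)."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    counts = []
    for i in range(n - m + 1):
        # Compare the pattern against the window text[i : i + m].
        window = text[i:i + m]
        counts.append(sum(1 for a, b in zip(window, pattern) if a != b))
    return counts


def k_mismatch_locations(text: str, pattern: str, k: int) -> List[int]:
    """Return all start positions where pattern matches text
    with at most k mismatches."""
    return [i for i, d in enumerate(mismatch_counts(text, pattern)) if d <= k]


if __name__ == "__main__":
    print(mismatch_counts("abcabd", "abd"))          # [1, 3, 3, 0]
    print(k_mismatch_locations("abcabd", "abd", 1))  # [0, 3]
```

The bounds quoted in the abstract improve on this baseline: O(nk) and O(n√(k log k)) depend on the mismatch threshold k, while O(n√(m log m)) depends on the pattern length m, rather than on the full product nm.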