From coding theory to efficient pattern matching

  • Authors:
  • Raphaël Clifford;Klim Efremenko;Ely Porat;Amir Rothschild

  • Affiliations:
  • University of Bristol, Bristol, UK;Bar-Ilan University, Rehovot, Israel;Bar-Ilan University, Ramat-Gan, Israel;Bar-Ilan University, Ramat-Gan, Israel

  • Venue:
  • SODA '09: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
  • Year:
  • 2009

Abstract

We consider the classic problem of pattern matching with few mismatches in the presence of promiscuously matching wildcard symbols. Given a text t of length n and a pattern p of length m with optional wildcard symbols and a bound k, our algorithm finds all the alignments for which the pattern matches the text with Hamming distance at most k and also returns the location and identity of each mismatch. The algorithm we present is deterministic and runs in Õ(kn) time, matching the best known randomised time complexity to within logarithmic factors. The solutions we develop borrow from the tool set of algebraic coding theory and provide a new framework in which to tackle approximate pattern matching problems.
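To make the problem statement concrete, the following is a minimal brute-force sketch of the input/output behaviour described above. It is not the paper's algorithm: it runs in O(nm) time rather than Õ(kn) and uses no coding theory. The function name `k_mismatch_with_wildcards`, the choice of `?` as the wildcard symbol, and the decision to let a wildcard in either string match any character are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

WILDCARD = "?"  # assumed wildcard symbol; the abstract does not fix one

# A mismatch is reported as (offset in pattern, text symbol, pattern symbol).
Mismatch = Tuple[int, str, str]


def k_mismatch_with_wildcards(
    text: str, pattern: str, k: int, wildcard: str = WILDCARD
) -> Dict[int, List[Mismatch]]:
    """Naive reference: map each alignment with Hamming distance <= k
    to the list of its mismatches (location and identity)."""
    n, m = len(text), len(pattern)
    results: Dict[int, List[Mismatch]] = {}
    for i in range(n - m + 1):  # candidate alignment of pattern in text
        mismatches: List[Mismatch] = []
        for j in range(m):
            a, b = text[i + j], pattern[j]
            # Assumption: a wildcard on either side matches any symbol.
            if a == wildcard or b == wildcard:
                continue
            if a != b:
                mismatches.append((j, a, b))
                if len(mismatches) > k:
                    break  # this alignment already exceeds the bound k
        if len(mismatches) <= k:
            results[i] = mismatches
    return results


if __name__ == "__main__":
    # Alignments 0 and 3 of "a?d" in "abcabd" have at most one mismatch.
    print(k_mismatch_with_wildcards("abcabd", "a?d", k=1))
```

A slow reference of this kind is mainly useful for checking the output of a faster implementation on small inputs, since it follows the problem definition directly.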