Approximate parameterized matching

  • Authors:
  • Carmit Hazay; Moshe Lewenstein; Dina Sokol

  • Affiliations:
  • Bar-Ilan University, Ramat Gan, Israel; Bar-Ilan University, Ramat Gan, Israel; Brooklyn College of the City University of New York

  • Venue:
  • ACM Transactions on Algorithms (TALG)
  • Year:
  • 2007

Abstract

Two equal-length strings s and s′, over alphabets Σs and Σs′, parameterize match if there exists a bijection π : Σs → Σs′ such that π(s) = s′, where π(s) is the renaming of each character of s via π. Parameterized matching is the problem of finding all parameterized matches of a pattern string p in a text t, and approximate parameterized matching is the problem of finding at each location a bijection π that maximizes the number of characters that are mapped from p to the appropriate |p|-length substring of t. Parameterized matching was introduced as a model for software duplication detection in software maintenance systems and also has applications in image processing and computational biology. For example, approximate parameterized matching models image searching with variable color maps in the presence of errors. We consider the problem for which an error threshold, k, is given, and the goal is to find all locations in t for which there exists a bijection π that maps p into the appropriate |p|-length substring of t with at most k mismatched mapped elements. Our main result is an algorithm for this problem with O(nk^1.5 + mk log m) time complexity, where m = |p| and n = |t|. We also show that when |p| = |t| = m, the problem is equivalent to the maximum matching problem on graphs, yielding an O(m + k^1.5) solution.
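The definitions in the abstract can be illustrated with a small Python sketch. This is not the paper's algorithm: the function names are hypothetical, the search is a naive O(nm) scan, and the mismatch count is computed by brute force over symbol assignments, which is practical only for very small alphabets. Leaving a symbol of s unmapped is treated here as mismatching all of its positions, which is a simplifying assumption.

```python
from collections import Counter
from itertools import permutations


def parameterized_match(s, t):
    """Return True if some bijection between the symbols of s and t maps s onto t."""
    if len(s) != len(t):
        return False
    forward, backward = {}, {}
    for a, b in zip(s, t):
        # setdefault records the first pairing; every later pairing must agree
        # with it in both directions, which is what makes the map a bijection.
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def find_parameterized_matches(p, t):
    """Naive scan: start indices where p parameterize matches a window of t."""
    m = len(p)
    return [i for i in range(len(t) - m + 1)
            if parameterized_match(p, t[i:i + m])]


def min_mismatches(s, t):
    """Fewest mismatched positions over all one-to-one symbol maps (brute force).

    Assumption: a symbol of s may stay unmapped (paired with None), in which
    case all of its positions count as mismatches.
    """
    assert len(s) == len(t)
    pair_counts = Counter(zip(s, t))  # how often symbol a sits above symbol b
    symbols_s = sorted(set(s))
    # Pad with None so that any symbol of s can be left unmapped.
    symbols_t = sorted(set(t)) + [None] * len(symbols_s)
    best = 0
    for assignment in permutations(symbols_t, len(symbols_s)):
        matched = sum(pair_counts[(a, b)] for a, b in zip(symbols_s, assignment))
        best = max(best, matched)
    return len(s) - best


if __name__ == "__main__":
    print(parameterized_match("abab", "xyxy"))
    print(find_parameterized_matches("ab", "xyyz"))
    print(min_mismatches("abab", "xyxx"))
```

The brute-force search in min_mismatches picks a one-to-one assignment of symbols that maximizes the number of agreeing positions. That is a maximum-weight matching in a bipartite graph whose edge weights are the pair counts, and it gives some intuition for the graph-matching connection the abstract mentions for the case |p| = |t|.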