Generalized function matching

  • Authors:
  • Amihood Amir; Igor Nor

  • Affiliations:
  • Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel and College of Computing, Georgia Tech, Atlanta, GA 30332-0280, USA; Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel

  • Venue:
  • Journal of Discrete Algorithms
  • Year:
  • 2007

Abstract

We present problems in different application areas (tandem repeats in computational biology, poetry and music analysis, and author validation) that require a more sophisticated pattern matching model than hitherto considered. We introduce a new matching criterion, generalized function matching, that encapsulates the notion suggested by the above problems. The generalized function matching problem has as its input a text $T$ of length $n$ over alphabet $\Sigma_T \cup \{\phi\}$ and a pattern $P = P[0]P[1]\cdots P[m-1]$ of length $m$ over alphabet $\Sigma_P \cup \{\phi\}$. We seek all text locations $i$ where the prefix of the substring that starts at $i$ is equal to $f(P[0])f(P[1])\cdots f(P[m-1])$, for some function $f: \Sigma_P \to \Sigma_T^*$. We give a polynomial time algorithm for the generalized pattern matching problem over bounded alphabets. We identify in this problem an interesting phenomenon that has been rare in pattern matching: the complexity of the naive solution is a polynomial with the alphabet size in the exponent. This causes a significant complexity difference between the bounded alphabet and infinite alphabet cases. We prove that the generalized pattern matching problem over infinite alphabets is NP-hard.
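
To make the matching criterion concrete, the following is a minimal brute-force sketch in Python. It is not the authors' algorithm. It rests on assumptions that the abstract does not state: every image f(a) is taken to be non-empty (otherwise every location would match trivially), the symbol φ is ignored, and the names `matches_at` and `generalized_function_matches` are hypothetical. The search tries every possible image for each distinct pattern symbol, which is consistent with the abstract's remark that the naive solution has the alphabet size in the exponent.

```python
def matches_at(text, pattern, i):
    """Return a mapping f such that some prefix of text[i:] equals
    f(P[0]) f(P[1]) ... f(P[m-1]), or None if no such mapping exists.

    Assumption (not from the source): every image f(a) is non-empty.
    """
    def extend(pos, k, f):
        # All pattern symbols consumed: the prefix text[i:pos] is matched.
        if k == len(pattern):
            return dict(f)
        sym = pattern[k]
        if sym in f:
            # Symbol already has an image; it must reappear verbatim here.
            image = f[sym]
            if text.startswith(image, pos):
                return extend(pos + len(image), k + 1, f)
            return None
        # Try every non-empty image, leaving at least one text character
        # for each of the pattern symbols that still follow.
        remaining = len(pattern) - k - 1
        for end in range(pos + 1, len(text) - remaining + 1):
            f[sym] = text[pos:end]
            result = extend(end, k + 1, f)
            if result is not None:
                return result
            del f[sym]  # backtrack
        return None

    return extend(i, 0, {})


def generalized_function_matches(text, pattern):
    """List every text location i at which the pattern matches."""
    return [i for i in range(len(text))
            if matches_at(text, pattern, i) is not None]
```

For example, with text "xyxyzz" and pattern "aab", location 0 matches under the mapping a to "xy" and b to "z", and no other location matches. Because f need not be injective in this sketch, two pattern symbols may share the same image.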