Complexity of Sequential Pattern Matching Algorithms

  • Authors:
  • Mireille Régnier; Wojciech Szpankowski

  • Venue:
  • RANDOM '98 Proceedings of the Second International Workshop on Randomization and Approximation Techniques in Computer Science
  • Year:
  • 1998

Abstract

We formally define a class of sequential pattern matching algorithms that includes all variations of the Morris-Pratt algorithm. For the last twenty years it has been known that the complexity of such algorithms is bounded by a linear function of the text length. Recently, substantial progress has been made in identifying lower bounds. We now prove that a linearity constant exists asymptotically for both the worst and the average cases. We use the Subadditive Ergodic Theorem to prove almost sure convergence. Our results hold for any given pattern and text, and for stationary ergodic pattern and text. In the course of the proof, we establish a structural property, namely the existence of "unavoidable positions" where the algorithm must stop to compare. This property seems to be unique to Morris-Pratt type algorithms (e.g., the Boyer-Moore algorithm does not possess it).
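To make the notion of complexity used here concrete, the following is a minimal sketch of a textbook Morris-Pratt search that counts text-pattern character comparisons. It is not the paper's formal definition of a sequential algorithm. The function names `border_table` and `morris_pratt` and the choice to count only comparisons made during the search phase are assumptions made for illustration.

```python
def border_table(pattern):
    """border[i] = length of the longest proper border of pattern[:i]; border[0] = -1."""
    m = len(pattern)
    border = [0] * (m + 1)
    border[0] = -1
    k = -1
    for i in range(m):
        # Fall back through shorter borders until one can be extended by pattern[i].
        while k >= 0 and pattern[k] != pattern[i]:
            k = border[k]
        k += 1
        border[i + 1] = k
    return border


def morris_pratt(pattern, text):
    """Return (match start positions, number of text-pattern comparisons)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    border = border_table(pattern)
    m = len(pattern)
    matches = []
    comparisons = 0
    k = 0  # number of pattern characters currently aligned and matched
    for i, c in enumerate(text):
        # Compare the text character against the pattern, shifting on mismatch.
        while k >= 0:
            comparisons += 1
            if pattern[k] == c:
                break
            k = border[k]
        k += 1
        if k == m:
            matches.append(i - m + 1)
            k = border[m]  # continue from the longest border to find overlaps
    return matches, comparisons


if __name__ == "__main__":
    text = "abababab"
    found, cost = morris_pratt("abab", text)
    print(found, cost, cost / len(text))
```

In this sketch the comparison count never exceeds twice the text length, which is the classical linear upper bound for Morris-Pratt. The ratio `cost / len(text)` is the empirical counterpart of the quantity whose limit the abstract calls the linearity constant.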