On maximal suffixes and constant-space linear-time versions of KMP algorithm

  • Authors:
  • Wojciech Rytter

  • Affiliations:
  • Institute of Informatics, Warsaw University, ul. Banacha 2, 02-097 Warsaw, Poland and Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2003

Abstract

Constant-space linear-time string-matching algorithms are usually very sophisticated. Most of them consist of two phases: a (very technical) preprocessing phase and a searching phase. An exception is Crochemore's one-phase algorithm (Theoret. Comput. Sci. 92 (1992) 33). It is an on-line version of the Knuth-Morris-Pratt algorithm (KMP) with "on-the-fly" computation of pattern shifts (as approximate periods). In this paper we further explore Crochemore's approach and construct alternative algorithms which are structured differently. In Crochemore's algorithm the approximate-period function is restarted from inside, which means that several internal variables of this function change globally. Crochemore's algorithm also depends strongly on the concrete implementation of the approximate-period computation. We present a simple modification of the KMP algorithm which works in O(1) space and O(n) time for any function which computes periods or approximate periods in O(1) space and linear time. The approximate-period function can be treated as a black box. We identify a class of patterns, self-maximal words, which are especially well suited for Crochemore-style string matching. A new O(1)-space string-matching algorithm, MaxSuffix-Matching, is proposed in the paper, which gives yet another example of the applicability of maximal suffixes.
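
The abstract leans on maximal suffixes without showing how one is found. As a rough illustration only, the sketch below uses the well-known constant-space, linear-time maximal-suffix scan from the string-matching literature. It is not the paper's MaxSuffix-Matching algorithm. The names `maximal_suffix` and `is_self_maximal` are hypothetical. The abstract does not define "self-maximal", so the reading used here (a word that equals its own maximal suffix) is an assumption.

```python
def maximal_suffix(x):
    """Return (start, period) of the lexicographically maximal suffix of x.

    Uses four integer variables only, so O(1) extra space, and runs in
    time linear in len(x). Assumes x is a non-empty string.
    """
    n = len(x)
    i = 0  # start of the best suffix found so far
    j = 1  # start of the competing suffix
    k = 0  # number of characters on which the two suffixes agree
    p = 1  # period of the scanned part of the best suffix
    while j + k < n:
        a = x[i + k]  # next character of the current best suffix
        b = x[j + k]  # next character of the competitor
        if a == b:
            if k + 1 == p:
                # A full period matched: skip the competitor ahead by p.
                j += p
                k = 0
            else:
                k += 1
        elif a > b:
            # Competitor loses; the scanned part now has period j - i.
            j += k + 1
            k = 0
            p = j - i
        else:
            # Competitor wins and becomes the new best suffix.
            i = j
            j = i + 1
            k = 0
            p = 1
    return i, p


def is_self_maximal(x):
    """Assumed definition: x is self-maximal if it is its own maximal suffix."""
    return maximal_suffix(x)[0] == 0
```

For example, `maximal_suffix("banana")` returns `(2, 2)`: the maximal suffix is `"nana"`, starting at index 2, with period 2. Under the assumed definition, `is_self_maximal("nana")` is `True` and `is_self_maximal("banana")` is `False`.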