An Online Algorithm for Finding the Longest Previous Factors

  • Authors:
  • Daisuke Okanohara; Kunihiko Sadakane

  • Affiliations:
  • Department of Computer Science, University of Tokyo, Tokyo, Japan 113-0013; Department of Computer Science and Communication Engineering, Kyushu University, Fukuoka, Japan 819-0395

  • Venue:
  • ESA '08 Proceedings of the 16th annual European symposium on Algorithms
  • Year:
  • 2008

Abstract

We present a novel algorithm for finding the longest previous factors in a text, whose working space is proportional to the size of the history text. Moreover, our algorithm is online and exact: unlike the previous batch algorithms [4, 5, 6, 7, 14], which need to read the entire input beforehand, it reports the longest match just after reading each character. The algorithm can be used directly for data compression, pattern analysis, and data mining. It also supports a window buffer, in that the working space can be bounded by discarding the history starting from the oldest character. Using the dynamic rank/select dictionary [17], our algorithm requires n log σ + o(n log σ) + O(n) bits of working space and O(log^3 n) time per character, for O(n log^3 n) total time, where n is the length of the history and σ is the alphabet size. We implemented our algorithm and compared it with recent algorithms [4, 5, 14] in terms of speed and working space. On real-world data, our algorithm uses less than half the working space of the previous methods, with a reasonable decline in speed.
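
To make the quantity being computed concrete, the following is a minimal brute-force sketch of the longest previous factor at each position of a text. It assumes the usual definition: for position i, the length of the longest substring starting at i that also starts at some earlier position j < i, with overlap allowed. This is not the algorithm of the paper: it is a batch computation that needs the whole input, runs in cubic time in the worst case, and uses none of the rank/select machinery. The function name `longest_previous_factors` is hypothetical and chosen only for illustration.

```python
def longest_previous_factors(text: str) -> list[int]:
    """Return lpf, where lpf[i] is the length of the longest substring
    starting at i that also starts at some earlier position j < i.

    Brute-force illustration of the definition only (assumption: the
    occurrences may overlap); not the online algorithm of the paper.
    """
    n = len(text)
    lpf = [0] * n
    for i in range(n):
        best = 0
        for j in range(i):  # every earlier starting position
            length = 0
            # Extend the match while both copies agree; j < i, so
            # checking i + length < n also keeps j + length in range.
            while i + length < n and text[j + length] == text[i + length]:
                length += 1
            best = max(best, length)
        lpf[i] = best
    return lpf


if __name__ == "__main__":
    print(longest_previous_factors("abab"))  # [0, 0, 2, 1]
    print(longest_previous_factors("aaaa"))  # [0, 3, 2, 1]
```

In a compression setting, a value such as `lpf[2] == 2` for "abab" means the two characters starting at position 2 can be encoded as a reference to earlier text rather than as literals.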