Linear time algorithm for the generalised longest common repeat problem

  • Authors:
  • Inbok Lee;Yoan José Pinzón Ardila

  • Affiliations:
  • Dept. of Computer Science, King's College London, London, United Kingdom;Dept. of Computer Science, King's College London, London, United Kingdom

  • Venue:
  • SPIRE'05 Proceedings of the 12th international conference on String Processing and Information Retrieval
  • Year:
  • 2005

Abstract

Given a set of strings $\mathcal{U} = \{T_{1}, T_{2}, \ldots, T_{\ell}\}$, the longest common repeat problem is to find the longest common substring that appears at least twice in each string of $\mathcal{U}$, considering direct, inverted, mirror, and everted repeats. In this paper we define the generalised longest common repeat problem, in which the number of times a repeat must appear in each string can be set. We present a linear-time algorithm for this problem using the suffix array. We also show an application of our algorithm: finding a longest common substring that appears only in a subset $\mathcal{U}^{\prime}$ of $\mathcal{U}$ and not in $\mathcal{U} \setminus \mathcal{U}^{\prime}$.
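To make the two problem statements concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time suffix-array algorithm of the paper, which the abstract does not describe; it only illustrates what a correct answer looks like. Several things here are assumptions: the function names, the per-string threshold list `min_counts` (one plausible reading of "set the number of times that a repeat should appear in each string"), counting overlapping occurrences, and the restriction to direct repeats (inverted, mirror, and everted repeats are ignored).

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps (an assumption)."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def _candidates(first, length):
    """Distinct substrings of `first` with the given length, in sorted order."""
    return sorted({first[i:i + length] for i in range(len(first) - length + 1)})


def longest_common_repeat(strings, min_counts):
    """Brute-force sketch: longest substring occurring at least min_counts[i]
    times in strings[i]. Hypothetical helper, direct repeats only."""
    if not strings or len(strings) != len(min_counts):
        raise ValueError("need one threshold per string")
    max_len = min(len(s) for s in strings)
    # Try the longest candidates first so the first hit is a longest answer.
    for length in range(max_len, 0, -1):
        for cand in _candidates(strings[0], length):
            if all(count_occurrences(s, cand) >= k
                   for s, k in zip(strings, min_counts)):
                return cand
    return ""


def longest_exclusive_substring(included, excluded):
    """Brute-force sketch of the application: longest substring present in
    every string of `included` and absent from every string of `excluded`."""
    if not included:
        raise ValueError("need at least one included string")
    max_len = min(len(s) for s in included)
    for length in range(max_len, 0, -1):
        for cand in _candidates(included[0], length):
            if (all(cand in s for s in included)
                    and not any(cand in s for s in excluded)):
                return cand
    return ""


if __name__ == "__main__":
    print(longest_common_repeat(["abcabcab", "xabcyabc", "abcab"], [2, 2, 1]))
    print(longest_exclusive_substring(["abcde", "xbcdy"], ["bcq"]))
```

Both functions take time polynomial in the total input length and are meant only for checking small examples by hand. When several longest answers exist, this sketch returns the lexicographically smallest one, which is a choice made here for determinism rather than something stated in the abstract.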