Efficient algorithms for three variants of the LPF table

  • Authors:
  • Maxime Crochemore;Costas S. Iliopoulos;Marcin Kubica;Wojciech Rytter;Tomasz Waleń

  • Affiliations:
  • King's College London, London WC2R 2LS, UK and Université Paris-Est, France;King's College London, London WC2R 2LS, UK and Digital Ecosystems & Business Intelligence Institute, Curtin University of Technology, Perth WA 6845, Australia;Institute of Informatics, University of Warsaw, Warsaw, Poland;Institute of Informatics, University of Warsaw, Warsaw, Poland and Department of Mathematics and Informatics, Copernicus University, Toruń, Poland;Institute of Informatics, University of Warsaw, Warsaw, Poland

  • Venue:
  • Journal of Discrete Algorithms
  • Year:
  • 2012

Abstract

The concept of a longest previous factor (LPF) is inherent to Ziv-Lempel factorization of strings in text compression, as well as to statistics of repetitions and symmetries. It is expressed as a table: LPF[i] is the maximum length of a factor starting at position i that also appears earlier in the given text. We show how to efficiently compute three new tables storing different variants of previous factors (past segments) of a string. The longest previous non-overlapping factor for a given position i is the longest factor starting at i that has an exact copy occurring entirely before i, while the longest previous non-overlapping reverse factor for a given position i is the longest factor starting at i whose reverse copy occurs entirely before i. In both problems, the previous copies of the factors are required to occur within the prefix ending at position i-1. The longest previous (possibly overlapping) reverse factor is the longest factor starting at i whose reverse copy starts before i. These problems have not been explicitly considered before, but they have several applications and are natural extensions of the extensively studied longest previous factor problem. Moreover, the newly introduced tables store additional information on the structure of the string, which helps to improve, for example, gapped palindrome detection and text compression using reverse factors.
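
To make the four definitions concrete, the following is a minimal brute-force sketch in Python. It is not the efficient algorithm of the paper; it only restates the definitions from the abstract as code, with 0-based positions. The function names (`previous_factor_table`, `lpf`, `lpnf`, `lpnrf`, `lprf`) and the parameters `reverse` and `allow_overlap` are hypothetical, chosen here for illustration, and the overlapping reverse variant follows a literal reading of "its reverse copy starts before i".

```python
def previous_factor_table(text, reverse=False, allow_overlap=True):
    """Brute-force table of previous-factor lengths (illustrative only).

    For each position i, record the largest k such that the factor
    text[i:i+k] (or its reverse, if reverse=True) equals text[j:j+k]
    for some earlier start j. With allow_overlap=True the copy only has
    to start before i (j < i); otherwise it must end before i (j + k <= i).
    """
    n = len(text)
    table = [0] * n
    for i in range(n):
        # Try every length; no early exit, because in the overlapping
        # reverse variant a longer factor can qualify when a shorter
        # one does not.
        for k in range(1, n - i + 1):
            factor = text[i:i + k]
            wanted = factor[::-1] if reverse else factor
            last_start = i - 1 if allow_overlap else i - k
            if any(text[j:j + k] == wanted for j in range(last_start + 1)):
                table[i] = k
    return table


def lpf(text):
    # Longest previous factor: the copy may overlap position i.
    return previous_factor_table(text, reverse=False, allow_overlap=True)


def lpnf(text):
    # Longest previous non-overlapping factor.
    return previous_factor_table(text, reverse=False, allow_overlap=False)


def lpnrf(text):
    # Longest previous non-overlapping reverse factor.
    return previous_factor_table(text, reverse=True, allow_overlap=False)


def lprf(text):
    # Longest previous (possibly overlapping) reverse factor.
    return previous_factor_table(text, reverse=True, allow_overlap=True)
```

On the string "aaaa" the overlapping and non-overlapping tables differ: `lpf("aaaa")` gives `[0, 3, 2, 1]` while `lpnf("aaaa")` gives `[0, 1, 2, 1]`. On the palindrome "abcba", `lpnrf` gives `[0, 0, 0, 2, 1]` and `lprf` gives `[0, 4, 3, 2, 1]`, which hints at why the reverse tables carry information about palindromic structure. This sketch takes roughly cubic time or worse and is meant only for checking small examples by hand.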