Efficient algorithms for three variants of the LPF table

Authors:
Maxime Crochemore;Costas S. Iliopoulos;Marcin Kubica;Wojciech Rytter;Tomasz Waleń
Affiliations:
King's College London, London WC2R 2LS, UK and Université Paris-Est, France;King's College London, London WC2R 2LS, UK and Digital Ecosystems & Business Intelligence Institute, Curtin University of Technology, Perth WA 6845, Australia;Institute of Informatics, University of Warsaw, Warsaw, Poland;Institute of Informatics, University of Warsaw, Warsaw, Poland and Department of Mathematics and Informatics, Copernicus University, Toruń, Poland;Institute of Informatics, University of Warsaw, Warsaw, Poland
Venue:
Journal of Discrete Algorithms
Year:
2012

Abstract

The concept of a longest previous factor (LPF) is inherent to the Ziv-Lempel factorization of strings in text compression, as well as to statistics of repetitions and symmetries. It is expressed as a table: LPF[i] is the maximum length of a factor starting at position i that also appears earlier in the given text. We show how to efficiently compute three new tables storing different variants of previous factors (past segments) of a string. The longest previous non-overlapping factor for a given position i is the longest factor starting at i that has an exact copy occurring entirely before i, while the longest previous non-overlapping reverse factor for a given position i is the longest factor starting at i whose reverse copy occurs entirely before i. In both problems, the previous copies of the factors are required to occur within the prefix ending at position i-1. The longest previous (possibly overlapping) reverse factor is the longest factor starting at i whose reverse copy starts before i. These problems have not been explicitly considered before, but they have several applications and are natural extensions of the extensively studied longest previous factor problem. Moreover, the newly introduced tables store additional information on the structure of the string, which helps to improve, for example, gapped palindrome detection and text compression using reverse factors.
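
To make the four definitions in the abstract concrete, the following brute-force Python sketch computes the classical LPF table and the three variants directly from their definitions. It is not the efficient method of the paper: it tries every factor length and every earlier start position, so it is only suitable for checking small examples. The function name `previous_factor_tables` and the dictionary keys are hypothetical names chosen for illustration, and positions are 0-based.

```python
def previous_factor_tables(text):
    """Brute-force tables for the LPF and its three variants (illustration only).

    Positions are 0-based. Function name and dictionary keys are hypothetical;
    this is a direct transcription of the definitions, not an efficient algorithm.
    """
    n = len(text)
    lpf = [0] * n                      # copy starts before i, may overlap
    non_overlapping = [0] * n          # copy lies entirely before i
    reverse_overlapping = [0] * n      # reversed copy starts before i
    reverse_non_overlapping = [0] * n  # reversed copy lies entirely before i

    for i in range(n):
        # Try every factor length k; k only grows, so the last hit is the maximum.
        # No early exit: the overlapping reverse variant need not be monotone in k.
        for k in range(1, n - i + 1):
            factor = text[i:i + k]
            reversed_factor = factor[::-1]
            for j in range(i):  # candidate earlier start position j < i
                window = text[j:j + k]
                ends_before_i = j + k <= i
                if window == factor:
                    lpf[i] = k
                    if ends_before_i:
                        non_overlapping[i] = k
                if window == reversed_factor:
                    reverse_overlapping[i] = k
                    if ends_before_i:
                        reverse_non_overlapping[i] = k

    return {
        "lpf": lpf,
        "non_overlapping": non_overlapping,
        "reverse_overlapping": reverse_overlapping,
        "reverse_non_overlapping": reverse_non_overlapping,
    }


if __name__ == "__main__":
    for name, table in previous_factor_tables("abba").items():
        print(name, table)
```

On the text `abba`, the overlapping reverse table has value 3 at position 1: the factor `bba` starting there is the reverse of `abb`, which starts at position 0 but overlaps the factor. The non-overlapping reverse table has value 0 at the same position, which shows how the two reverse variants differ.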