Compressed suffix trees: Efficient computation and storage of LCP-values

  • Authors: Simon Gog; Enno Ohlebusch

  • Affiliations: The University of Melbourne, Parkville, Australia; Ulm University, Ulm, Germany

  • Venue: Journal of Experimental Algorithmics (JEA)
  • Year: 2013

Abstract

The suffix tree is a very important data structure in string processing, but typical implementations suffer from huge space consumption. In large-scale applications, compressed suffix trees (CSTs) are therefore used instead. A CST consists of three (compressed) components: the suffix array, the longest common prefix (LCP)-array and data structures for simulating navigational operations on the suffix tree. The LCP-array stores the lengths of the LCPs of lexicographically adjacent suffixes, and it can be computed in linear time. In this article, we present a new LCP-array construction algorithm that is fast and very space efficient. In practice, our algorithm outperforms alternative algorithms. Moreover, we introduce a new compressed representation of LCP-arrays.
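
To make the definition of the LCP-array concrete, the following is a minimal sketch in Python. It is not the construction algorithm presented in the article; it uses the well-known linear-time method of Kasai et al., which is assumed here purely for illustration. The function names `suffix_array` and `lcp_array` are hypothetical, and the suffix array is built by naive sorting, which is not linear time and is only suitable for small inputs.

```python
def suffix_array(text):
    """Return the starting positions of all suffixes in lexicographic order.

    Naive construction by sorting the suffixes directly; for illustration
    only, since it is far slower than dedicated suffix sorting algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Return the LCP-array for `text` given its suffix array `sa`.

    lcp[k] is the length of the longest common prefix of the suffixes
    starting at sa[k-1] and sa[k]; lcp[0] is defined as 0.
    This follows the method of Kasai et al. (an assumption of this sketch,
    not the algorithm of the article).
    """
    n = len(text)
    # rank[i] is the position of suffix i in the suffix array (inverse of sa).
    rank = [0] * n
    for pos, start in enumerate(sa):
        rank[start] = pos

    lcp = [0] * n
    h = 0  # number of characters already known to match
    for i in range(n):  # process suffixes in text order
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # lexicographic predecessor of suffix i
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            # The next suffix (i + 1) shares at least h - 1 characters
            # with its own predecessor, so the match length is reused.
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


sa = suffix_array("banana")
print(sa)                       # [5, 3, 1, 0, 4, 2]
print(lcp_array("banana", sa))  # [0, 1, 3, 0, 0, 2]
```

For "banana", the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", so for example the entry 3 records that "ana" and "anana" share the prefix "ana". Because the running match length `h` drops by at most one per step, the total number of character comparisons is linear in the length of the text.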