A Compressed Enhanced Suffix Array Supporting Fast String Matching

  • Authors: Enno Ohlebusch; Simon Gog
  • Affiliations: Institute of Theoretical Computer Science, University of Ulm, Ulm D-89069 (both authors)
  • Venue: SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
  • Year: 2009

Abstract

Index structures like the suffix tree or the suffix array are of utmost importance in stringology, most notably in exact string matching. In the last decade, research on compressed index structures has flourished because the main problem in many applications is the space consumption of the index. It is possible to simulate the matching of a pattern against a suffix tree on an enhanced suffix array by using range minimum queries or the so-called child table. In this paper, we show that the Super-Cartesian tree of the LCP-array (with which the suffix array is enhanced) very naturally explains the child table. More important, however, is the fact that the balanced parentheses representation of this tree constitutes a very natural compressed form of the child table, which makes it possible to locate all occ occurrences of a pattern P of length m in O(m log |Σ| + occ) time, where Σ is the underlying alphabet. Our compressed child table uses less space than previous solutions to the problem. An implementation is available.
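
The simulation of suffix-tree matching on an enhanced suffix array that the abstract mentions can be illustrated with a small sketch. It is not the compressed child table of the paper: it finds the child intervals of an lcp-interval by scanning for the minima of the LCP-array (a naive stand-in for range minimum queries), builds the suffix array by plain sorting, and therefore does not reach the stated time bound. All function names are hypothetical, and the sentinel "$" is assumed to be absent from the text and pattern and smaller than every text character.

```python
def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    # lcp[k] = length of the longest common prefix of the suffixes
    # starting at sa[k-1] and sa[k]; lcp[0] is set to 0 by convention.
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b, h = sa[k - 1], sa[k], 0
        while a + h < len(text) and b + h < len(text) and text[a + h] == text[b + h]:
            h += 1
        lcp[k] = h
    return lcp


def child_intervals(lcp, i, j):
    # Split the lcp-interval [i..j] (i < j) at every position where
    # lcp[i+1..j] attains its minimum l. A linear scan replaces the
    # constant-time range minimum queries of a real index.
    l = min(lcp[i + 1:j + 1])
    starts = [i] + [k for k in range(i + 1, j + 1) if lcp[k] == l]
    ends = [s - 1 for s in starts[1:]] + [j]
    return l, list(zip(starts, ends))


def find_occurrences(text, pattern):
    # Top-down traversal of the (virtual) suffix tree via lcp-intervals.
    text += "$"  # assumed sentinel, see the note above
    sa = build_suffix_array(text)
    lcp = build_lcp_array(text, sa)
    i, j, d, m = 0, len(sa) - 1, 0, len(pattern)
    while True:
        if i == j:
            depth, children = m, []  # leaf: compare the rest directly
        else:
            l, children = child_intervals(lcp, i, j)
            depth = min(l, m)
        # All suffixes in [i..j] agree up to `depth`, so checking one suffices.
        if text[sa[i] + d:sa[i] + depth] != pattern[d:depth]:
            return []
        d = depth
        if d == m:
            return sorted(sa[i:j + 1])
        # Descend into the child whose suffixes continue with pattern[d].
        for ci, cj in children:
            if text[sa[ci] + d:sa[ci] + d + 1] == pattern[d]:
                i, j = ci, cj
                break
        else:
            return []
```

For example, `find_occurrences("mississippi", "ssi")` returns the two starting positions 2 and 5. Replacing the scan in `child_intervals` with a child table or its balanced-parentheses encoding is the step the paper addresses.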