Compact Suffix Array — A Space-Efficient Full-Text Index

  • Authors:
  • Veli Mäkinen

  • Affiliations:
  • Department of Computer Science, P.O. Box 26 (Teollisuuskatu 23), FIN-00014 University of Helsinki, Finland

  • Venue:
  • Fundamenta Informaticae - Computing Patterns in Strings
  • Year:
  • 2003

Abstract

The suffix array is a widely used full-text index that allows fast searches on the text. It is constructed by sorting all suffixes of the text in lexicographic order and storing pointers to the suffixes in this order. Searches on the suffix array are carried out by binary search. The compact suffix array is a compressed form of the suffix array that still allows binary searches, but the search times also depend on the degree of compression. In this paper, we give efficient methods for constructing and querying compact suffix arrays. We also study practical issues, such as the trade-off between compression and search times, and show how to reduce the space requirement of the construction. Experimental results are provided in comparison with other search methods. With a large text corpus, the index took 1.6 times the size of the text, while the searches were only two times slower than from a suffix array.
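As an illustration of the plain (uncompressed) suffix array that the abstract starts from, the following minimal Python sketch sorts the suffix start positions and answers a query with two binary searches. It is not the compact suffix array of the paper and does not reproduce its construction or compression. The function names `build_suffix_array` and `find_occurrences` are hypothetical, and the naive sort is chosen for readability rather than efficiency.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive sketch: sorting by the suffix strings themselves is simple but
    slow and memory-hungry on large texts.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions where pattern occurs in text.

    Suffixes that begin with pattern form one contiguous range of sa,
    so two binary searches locate the two ends of that range.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

For example, `find_occurrences("banana", build_suffix_array("banana"), "ana")` reports the two overlapping occurrences of `ana`. Each binary search performs O(log n) string comparisons of at most m characters, which is the baseline search cost the abstract compares the compact variant against.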