A categorization theorem on suffix arrays with applications to space efficient text indexes

  • Authors:
  • Meng He; J. Ian Munro; S. Srinivasa Rao

  • Affiliations:
  • University of Waterloo, Waterloo, ON, Canada; University of Waterloo, Waterloo, ON, Canada; University of Waterloo, Waterloo, ON, Canada

  • Venue:
  • SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
  • Year:
  • 2005

Abstract

In this paper, we design succinct index structures for a text string T of n binary symbols to support efficient searching of a pattern P of length m. Motivated by the fact that the standard representation of suffix arrays uses n lg n bits, which is more than the theoretical minimum, we present a theorem that characterizes a permutation as the suffix array of a binary string. Based on the theorem, we design a succinct representation of suffix arrays of binary strings that uses n + o(n) bits, which is the theoretical minimum plus a lower order term, and answers existential and cardinality queries in O(m) time without storing the raw text. With 2n + o(n) bits, we can list pattern occurrences in O(m + occ lg n) time in the general case, and for long patterns, when m = Ω(lg^{1+ε} n), we answer such listing queries in O(m + occ) time. We also present another implementation that uses O(n) bits and supports pattern searching in O(m + occ lg^λ n) time for any fixed λ such that 0
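
For orientation, the three query types named in the abstract (existential, cardinality, and listing) can be illustrated on the standard, non-succinct suffix array that the paper takes as its baseline. The sketch below is not the paper's n + o(n)-bit structure: it keeps the raw text, stores the array as ordinary integers, builds it naively, and searches in O(m lg n) time rather than O(m). All function names are hypothetical and chosen only for this illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction by sorting suffix strings; fine for small examples only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_range(text, sa, pattern):
    """Return (lo, hi) such that sa[lo:hi] are exactly the suffixes starting with `pattern`."""
    m = len(pattern)

    # First binary search: leftmost suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return start, lo


def exists(text, sa, pattern):
    """Existential query: does `pattern` occur in `text`?"""
    lo, hi = find_range(text, sa, pattern)
    return hi > lo


def count_occurrences(text, sa, pattern):
    """Cardinality query: how many times does `pattern` occur in `text`?"""
    lo, hi = find_range(text, sa, pattern)
    return hi - lo


def list_occurrences(text, sa, pattern):
    """Listing query: sorted start positions of every occurrence of `pattern`."""
    lo, hi = find_range(text, sa, pattern)
    return sorted(sa[lo:hi])
```

As a usage example, for the binary text `"0110101"` the suffix array is `[5, 3, 0, 6, 4, 2, 1]`, and `list_occurrences` for the pattern `"101"` returns `[2, 4]`. Each entry of this array is a full integer, which is the n lg n-bit cost that the abstract sets out to reduce.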