Succinct indexes for strings, binary relations and multilabeled trees

  • Authors:
  • Jérémy Barbay; Meng He; J. Ian Munro; Srinivasa Rao Satti

  • Affiliations:
  • University of Chile, Chile; University of Waterloo, Canada; University of Waterloo, Canada; Seoul National University, South Korea

  • Venue:
  • ACM Transactions on Algorithms (TALG)
  • Year:
  • 2011

Abstract

We define and design succinct indexes for several abstract data types (ADTs). The concept is to design auxiliary data structures that ideally occupy asymptotically less space than the information-theoretic lower bound on the space required to encode the given data, and support an extended set of operations using the basic operators defined in the ADT. The main advantage of succinct indexes as opposed to succinct (integrated data/index) encodings is that we make assumptions only on the ADT through which the main data is accessed, rather than the way in which the data is encoded. This allows more freedom in the encoding of the main data. In this article, we present succinct indexes for various data types, namely strings, binary relations and multilabeled trees. Given the support for the interface of the ADTs of these data types, we can support various useful operations efficiently by constructing succinct indexes for them. When the operators in the ADTs are supported in constant time, our results are comparable to previous results, while allowing more flexibility in the encoding of the given data. Using our techniques, we design a succinct encoding that represents a string of length $n$ over an alphabet of size $\sigma$ using $nH_k(S) + \lg\sigma \cdot o(n) + O(n \lg\sigma / \lg\lg\lg\sigma)$ bits to support access/rank/select operations in $o((\lg\lg\sigma)^{1+\varepsilon})$ time, for any fixed constant $\varepsilon > 0$. We also design a succinct text index using $nH_0(S) + O(n \lg\sigma / \lg\lg\sigma)$ bits that supports finding all the $occ$ occurrences of a given pattern of length $m$ in $O(m \lg\lg\sigma + occ \cdot \lg n / \lg^{\varepsilon}\sigma)$ time, for any fixed constant 0
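
To make the separation between index and encoding concrete, here is a minimal Python sketch. It is not the construction from the article and it is not succinct: it simply samples cumulative symbol counts at block boundaries. What it illustrates is the interface discipline described in the abstract, namely that the index touches the data only through the ADT operator `access` (plus the length), so the underlying encoding can be swapped freely. The class names `PlainString`, `CodedString` and `SampledRankSelectIndex`, the `block` parameter, and the conventions chosen for `rank` and `select` are all assumptions made for this illustration.

```python
from bisect import bisect_left


class PlainString:
    """Hypothetical ADT implementation: stores the string as-is."""

    def __init__(self, text):
        self._text = text

    def __len__(self):
        return len(self._text)

    def access(self, i):
        return self._text[i]


class CodedString:
    """Hypothetical alternative encoding: one small integer code per symbol."""

    def __init__(self, text):
        self._alphabet = sorted(set(text))
        code = {c: k for k, c in enumerate(self._alphabet)}
        self._codes = bytearray(code[c] for c in text)  # assumes at most 256 symbols

    def __len__(self):
        return len(self._codes)

    def access(self, i):
        return self._alphabet[self._codes[i]]


class SampledRankSelectIndex:
    """Illustrative index: uses only len(data) and data.access(i)."""

    def __init__(self, data, block=4):
        self.data = data
        self.block = block
        # samples[c][b] = occurrences of c in positions [0, b * block)
        self.samples = {}
        running = {}
        for i in range(len(data)):
            if i % block == 0:
                for c, counts in self.samples.items():
                    counts.append(running[c])
            c = data.access(i)
            if c not in self.samples:
                # First sighting: c has zero occurrences before every boundary so far.
                self.samples[c] = [0] * (i // block + 1)
                running[c] = 0
            running[c] += 1

    def rank(self, c, i):
        """Number of occurrences of c in positions [0, i)."""
        counts = self.samples.get(c)
        if counts is None:
            return 0
        b = min(i // self.block, len(counts) - 1)
        total = counts[b]
        for j in range(b * self.block, i):
            if self.data.access(j) == c:
                total += 1
        return total

    def select(self, c, j):
        """Position of the j-th occurrence of c (1-based), or -1 if absent."""
        counts = self.samples.get(c)
        if counts is None or j < 1:
            return -1
        b = bisect_left(counts, j) - 1  # last boundary with fewer than j occurrences
        total = counts[b]
        for pos in range(b * self.block, len(self.data)):
            if self.data.access(pos) == c:
                total += 1
                if total == j:
                    return pos
        return -1


plain_index = SampledRankSelectIndex(PlainString("abracadabra"))
coded_index = SampledRankSelectIndex(CodedString("abracadabra"))
```

Both `plain_index` and `coded_index` answer the same queries even though the underlying encodings differ, because the index code never inspects the storage directly. A real succinct index would replace the sampled count tables with structures whose size is asymptotically smaller than the data itself.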