Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract)

  • Authors:
  • Martin Dietzfelbinger; Rasmus Pagh

  • Affiliations:
  • Technische Universität Ilmenau, Ilmenau, Germany 98684; IT University of Copenhagen, København S, Denmark 2300

  • Venue:
  • ICALP '08: Proceedings of the 35th International Colloquium on Automata, Languages and Programming, Part I
  • Year:
  • 2008

Abstract

The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function $f\colon U\to \{0,1\}^r$ that has specified values on the elements of a given set $S \subseteq U$, $|S| = n$, but may have any value on elements outside $S$. All known methods (e.g., those based on perfect hash functions) induce a space overhead of $\Theta(n)$ bits over the optimum, regardless of the evaluation time. We show that for any $k$, query time $O(k)$ can be achieved using space that is within a factor $1 + e^{-k}$ of optimal, asymptotically for large $n$. The expected time to construct the data structure is $O(n)$. If we allow logarithmic evaluation time, the additive overhead can be reduced to $O(\log\log n)$ bits with high probability. A general reduction transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Thus we obtain space bounds arbitrarily close to the lower bound for this problem as well. The evaluation procedures of our data structures are extremely simple. For the results stated above we assume free access to fully random hash functions. This assumption can be justified using space $o(n)$ to simulate full randomness on a RAM.
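
To make the two problems in the abstract concrete, the following Python sketch builds a small retrieval structure and then derives an approximate membership filter from it by storing an $r$-bit fingerprint of each key. It is an illustration under stated assumptions, not the construction from the paper: the abstract does not describe the method, so the sketch uses a commonly known approach in which $f(x)$ is the XOR of a few table cells and the table is found by solving a linear system over GF(2). All names (`build_retrieval`, `lookup`, `build_filter`, `might_contain`), the choice of three probes, and the 1.3 space slack are hypothetical, and keyed BLAKE2b hashing stands in for the fully random hash functions the abstract assumes.

```python
import hashlib
import random


def _hash(key: str, seed: int, salt: str) -> int:
    """64-bit hash; a stand-in for a fully random hash function (assumption)."""
    data = f"{salt}:{seed}:{key}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def positions(key: str, seed: int, m: int, k: int = 3) -> list:
    """The k distinct table cells probed for this key."""
    return random.Random(_hash(key, seed, "pos")).sample(range(m), k)


def fingerprint(key: str, r: int) -> int:
    """An r-bit fingerprint, hashed independently of the probe positions."""
    return _hash(key, 0, "fp") & ((1 << r) - 1)


def build_retrieval(values: dict, k: int = 3, slack: float = 1.3, max_tries: int = 50):
    """Find a table such that the XOR of each key's k cells equals its value.

    The table has about slack * n cells (an illustrative choice, not the
    bound from the paper). Returns (table, seed).
    """
    m = int(slack * len(values)) + k
    for seed in range(max_tries):
        pivots = {}  # leading cell index -> (bitmask of cells, right-hand side)
        consistent = True
        for key, val in values.items():
            mask, rhs = 0, val
            for p in positions(key, seed, m, k):
                mask |= 1 << p
            # Gaussian elimination over GF(2), reducing by the leading bit.
            while mask:
                lead = mask.bit_length() - 1
                if lead not in pivots:
                    pivots[lead] = (mask, rhs)
                    break
                pmask, prhs = pivots[lead]
                mask ^= pmask
                rhs ^= prhs
            if mask == 0 and rhs != 0:
                consistent = False  # unsolvable with this seed; retry
                break
        if not consistent:
            continue
        # Back-substitution: every non-leading cell of a row has a smaller
        # index, so processing pivots in increasing order is sufficient.
        table = [0] * m
        for lead in sorted(pivots):
            mask, rhs = pivots[lead]
            rest = mask ^ (1 << lead)
            while rest:
                j = rest.bit_length() - 1
                rhs ^= table[j]
                rest ^= 1 << j
            table[lead] = rhs
        return table, seed
    raise RuntimeError("no solvable system found; increase slack or max_tries")


def lookup(table: list, seed: int, key: str, k: int = 3) -> int:
    """Evaluate f(key): XOR of k table cells. Arbitrary for keys outside S."""
    out = 0
    for p in positions(key, seed, len(table), k):
        out ^= table[p]
    return out


def build_filter(keys, r: int):
    """Approximate membership via retrieval: store each key's fingerprint."""
    return build_retrieval({x: fingerprint(x, r) for x in keys})


def might_contain(filt, key: str, r: int) -> bool:
    """True for every stored key; true for other keys with rate about 2**-r."""
    table, seed = filt
    return lookup(table, seed, key) == fingerprint(key, r)
```

For example, `build_filter(keys, r=8)` followed by `might_contain(filt, x, 8)` never rejects a stored key, while a key outside the set is accepted only when its fingerprint happens to match the retrieved value. The table here occupies roughly $1.3\,n r$ bits, which is well above the near-optimal space the abstract reports; the sketch is meant only to show the shape of the reduction.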