Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract)

Authors:
Martin Dietzfelbinger; Rasmus Pagh
Affiliations:
Technische Universität Ilmenau, Ilmenau, Germany 98684; IT University of Copenhagen, København S, Denmark 2300
Venue:
ICALP '08 Proceedings of the 35th International Colloquium on Automata, Languages and Programming, Part I
Year:
2008

Abstract

The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function $f\colon U\to \{0,1\}^r$ that has specified values on the elements of a given set $S \subseteq U$, $|S| = n$, but may have any value on elements outside $S$. All known methods (e.g. those based on perfect hash functions) induce a space overhead of $\Theta(n)$ bits over the optimum, regardless of the evaluation time. We show that for any $k$, query time $O(k)$ can be achieved using space that is within a factor $1 + e^{-k}$ of optimal, asymptotically for large $n$. The time to construct the data structure is $O(n)$, expected. If we allow logarithmic evaluation time, the additive overhead can be reduced to $O(\log\log n)$ bits with high probability. A general reduction transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Thus we obtain space bounds arbitrarily close to the lower bound for this problem as well. The evaluation procedures of our data structures are extremely simple. For the results stated above we assume free access to fully random hash functions. This assumption can be justified using space $o(n)$ to simulate full randomness on a RAM.
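
To make the two problems concrete, the following is a minimal Python sketch of one well-known way to build a retrieval structure and to turn it into an approximate membership structure. It is not the construction of the paper: it solves a single linear system over GF(2) by plain Gaussian elimination, so construction is far slower than the $O(n)$ bound stated above, and the table has about $1.3n$ entries of $r$ bits rather than a $1 + e^{-k}$ factor of the optimum. All names (`_hash`, `build_retrieval`, `retrieve`, `build_filter`, `maybe_contains`), the choice of 3 probes, and the slack factor 1.3 are assumptions made for illustration, and salted BLAKE2 digests stand in for the fully random hash functions assumed in the abstract.

```python
import hashlib

def _hash(key, salt, mod):
    """Hypothetical stand-in for a fully random hash function into range(mod)."""
    digest = hashlib.blake2b(f"{salt}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % mod

def build_retrieval(items, k=3, slack=1.3, max_tries=50):
    """Find a table such that XOR of table[h_i(key)], i < k, equals items[key]."""
    m = max(k, int(slack * len(items)) + 1)
    for seed in range(max_tries):
        pivots = {}          # leading column -> (row bitmask, right-hand side)
        solvable = True
        for key, value in items.items():
            mask = 0
            for i in range(k):
                mask ^= 1 << _hash(key, (seed, i), m)   # repeated positions cancel
            while mask:                                  # reduce against pivot rows
                col = mask.bit_length() - 1
                if col not in pivots:
                    pivots[col] = (mask, value)
                    break
                pmask, pvalue = pivots[col]
                mask ^= pmask
                value ^= pvalue
            else:
                if value != 0:       # row reduced to 0 = nonzero: inconsistent
                    solvable = False
                    break
        if not solvable:
            continue                 # retry with fresh hash functions
        table = [0] * m              # free columns stay 0
        for col in sorted(pivots):   # every other bit of a pivot row is below col
            mask, value = pivots[col]
            rest = mask ^ (1 << col)
            while rest:
                low = rest & -rest
                value ^= table[low.bit_length() - 1]
                rest ^= low
            table[col] = value
        return seed, k, table
    raise RuntimeError("no solvable system found; increase slack or max_tries")

def retrieve(structure, key):
    """Evaluate f(key): XOR of k table cells. Arbitrary result for keys outside S."""
    seed, k, table = structure
    out = 0
    for i in range(k):
        out ^= table[_hash(key, (seed, i), len(table))]
    return out

def build_filter(keys, r):
    """Approximate membership via retrieval: store an r-bit fingerprint per key."""
    return build_retrieval({x: _hash(x, "fp", 1 << r) for x in keys})

def maybe_contains(structure, key, r):
    """Always True for stored keys; True with probability about 2**-r otherwise."""
    return retrieve(structure, key) == _hash(key, "fp", 1 << r)
```

The key set itself is never stored: a query only XORs `k` table cells, which mirrors the "extremely simple" evaluation procedure mentioned in the abstract, and the membership variant illustrates the general idea of the reduction by comparing the retrieved value against a fingerprint recomputed from the query key.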