Theory and practice of monotone minimal perfect hashing

Authors:
Djamal Belazzougui; Paolo Boldi; Rasmus Pagh; Sebastiano Vigna
Affiliations:
Université Paris Diderot - Paris 7, France; Università degli Studi di Milano, Italy; IT University of Copenhagen, Denmark; Università degli Studi di Milano, Italy
Venue:
Journal of Experimental Algorithmics (JEA)
Year:
2008

References:

Storing a Sparse Table with O(1) Worst Case Access Time. Journal of the ACM (JACM).
Order-preserving minimal perfect hash functions and information retrieval. ACM Transactions on Information Systems (TOIS) - Special issue on research and development in information retrieval.
Design patterns: elements of reusable object-oriented software.
Efficient suffix trees on secondary storage. Proceedings of the seventh annual ACM-SIAM symposium on Discrete algorithms.
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric. Journal of the ACM (JACM).
Efficient Storage and Retrieval by Content and Address of Static Files. Journal of the ACM (JACM).
WebBase: a repository of Web pages. Proceedings of the 9th international World Wide Web conference on Computer networks: the international journal of computer and telecommunications networking.
Succinct Representation of Balanced Parentheses and Static Trees. SIAM Journal on Computing.
Efficient Minimal Perfect Hashing in Nearly Minimal Space. STACS '01 Proceedings of the 18th Annual Symposium on Theoretical Aspects of Computer Science.
The Bloomier filter: an efficient data structure for static support lookup tables. SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms.
UbiCrawler: a scalable fully distributed web crawler. Software—Practice & Experience.
Cores in random hypergraphs and Boolean formulas. Random Structures & Algorithms.
A simple optimal representation for balanced parentheses. Theoretical Computer Science.
Compressed data structures: Dictionaries and data-aware measures. Theoretical Computer Science.
External perfect hashing for very large key sets. Proceedings of the sixteenth ACM conference on information and knowledge management.
Space-efficient static trees and graphs. SFCS '89 Proceedings of the 30th Annual Symposium on Foundations of Computer Science.
Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract). ICALP '08 Proceedings of the 35th international colloquium on Automata, Languages and Programming, Part I.
Bloomier Filters: A Second Look. ESA '08 Proceedings of the 16th annual European symposium on Algorithms.
A large time-aware web graph. ACM SIGIR Forum.
Monotone minimal perfect hashing: searching a sorted table with O(1) accesses. SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms.
Broadword implementation of rank/select queries. WEA'08 Proceedings of the 7th international conference on Experimental algorithms.
Optimal lower bounds for rank and select indexes. ICALP'06 Proceedings of the 33rd international conference on Automata, Languages and Programming - Volume Part I.
Efficient implementation of rank and select functions for succinct representation. WEA'05 Proceedings of the 4th international conference on Experimental and Efficient Algorithms.
Simple and space-efficient minimal perfect hash functions. WADS'07 Proceedings of the 10th international conference on Algorithms and Data Structures.

Abstract

Minimal perfect hash functions have proved useful for compressing data in several data management tasks. In particular, order-preserving minimal perfect hash functions (Fox et al. 1991) have been used to retrieve the position of a key in a given list of keys; however, the ability to preserve an arbitrary order leads to an unavoidable Ω(n log n) lower bound on the number of bits required to store the function. Recently, it was observed (Belazzougui et al. 2009) that very frequently the keys to be hashed are sorted in their intrinsic (i.e., lexicographical) order. This is typically the case for dictionaries of search engines, lists of URLs of Web graphs, and so on. We refer to this restricted version of the problem as monotone minimal perfect hashing. We analyze experimentally the data structures proposed by Belazzougui et al. (2009), and along the way we propose some new methods that, albeit asymptotically equivalent or worse, perform very well in practice and provide a balance between access speed, ease of construction, and space usage.
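
As a rough illustration of the two ideas in the abstract, the Python sketch below is an assumption-laden reference model, not any of the data structures evaluated in the paper. The hypothetical helper `order_preserving_bits` computes log2(n!), the number of bits needed to tell apart all n! possible orders of n keys, which is where an Ω(n log n) bound for arbitrary orders comes from. The hypothetical class `SortedTableRank` shows only the input/output behaviour expected of a monotone minimal perfect hash function (each key of a sorted set maps to its rank); it stores the keys explicitly and uses binary search, so it has none of the space savings that motivate the problem. The example URLs are made up.

```python
import math
from bisect import bisect_left


def order_preserving_bits(n: int) -> float:
    """Return log2(n!), the bits needed to distinguish all n! orders of n keys."""
    # lgamma(n + 1) is ln(n!), so dividing by ln(2) converts to base 2.
    return math.lgamma(n + 1) / math.log(2)


class SortedTableRank:
    """Reference semantics only: maps each key of a sorted set to its rank.

    Unlike a real monotone minimal perfect hash function, this keeps the
    whole key table in memory and answers queries by binary search.
    """

    def __init__(self, keys):
        self.keys = list(keys)
        # The keys must be distinct and in strictly increasing order.
        if any(a >= b for a, b in zip(self.keys, self.keys[1:])):
            raise ValueError("keys must be sorted and distinct")

    def __call__(self, key):
        # For a key in the set this is its rank in 0..n-1.
        # For any other key the result carries no meaning.
        return bisect_left(self.keys, key)


# Illustrative keys, already in lexicographical order.
urls = ["http://a.example/", "http://a.example/x", "http://b.example/"]
h = SortedTableRank(urls)
ranks = [h(u) for u in urls]
```

Here `ranks` is `[0, 1, 2]`: the function is minimal and perfect on the key set (a bijection onto 0..n-1) and monotone (it preserves the lexicographical order). Because that order is intrinsic to the keys rather than arbitrary, the log2(n!) counting argument no longer applies.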