Indexing internal memory with minimal perfect hash functions

Authors:
Fabiano C. Botelho;Hendrickson R. Langbehn;Guilherme Vale Menezes;Nivio Ziviani
Affiliations:
Federal Univ. of Minas Gerais, Belo Horizonte, Brazil and Federal Center for Technological Education of Minas Gerais, Belo Horizonte, Brazil;Federal Univ. of Minas Gerais, Belo Horizonte, Brazil;Federal Univ. of Minas Gerais, Belo Horizonte, Brazil;Federal Univ. of Minas Gerais, Belo Horizonte, Brazil
Venue:
SBBD '08 Proceedings of the 23rd Brazilian symposium on Databases
Year:
2008

References (10):
An optimal algorithm for generating minimal perfect hash functions. Information Processing Letters
Graph structure in the Web. Proceedings of the 9th international World Wide Web conference on Computer networks: the international journal of computer and telecommunications networking
Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Low Redundancy in Static Dictionaries with Constant Query Time. SIAM Journal on Computing
Optimizing database architecture for the new bottleneck: memory access. The VLDB Journal — The International Journal on Very Large Data Bases
Application of Minimal Perfect Hashing in Main Memory Indexing
Cuckoo hashing. Journal of Algorithms
Beyond Relational Databases. Queue - Databases
Architecture-conscious hashing. DaMoN '06 Proceedings of the 2nd international workshop on Data management on new hardware
Simple and space-efficient minimal perfect hash functions. WADS'07 Proceedings of the 10th international conference on Algorithms and Data Structures

Cited by (1):
Minimal perfect hashing: A competitive method for indexing internal memory. Information Sciences: an International Journal

Abstract

A perfect hash function (PHF) is an injective function that maps the keys of a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range; that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory-efficient storage and fast retrieval of items from static sets. Unlike other hashing schemes, MPHFs completely avoid the space and time wasted in dealing with collisions. In the past, storing the description of an MPHF took O(log n) bits per key, a space overhead similar to that of other hashing schemes. Recent results on MPHFs by [Botelho et al. 2007] changed this scenario: in their work, the space overhead of an MPHF is approximately 2.6 bits per key. The objective of this paper is to show that MPHFs are a good option for indexing internal memory when the key set is static and both successful and unsuccessful searches are allowed. We show that MPHFs provide the best tradeoff between space usage and lookup time when compared with linear hashing, quadratic hashing, double hashing, dense hashing, cuckoo hashing and sparse hashing. For example, MPHFs outperform linear hashing, quadratic hashing and double hashing when these methods have a hash table occupancy of 75% or higher; if the MPHF fits in the CPU cache, the same holds for hash table occupancies of 55% or higher. Furthermore, MPHFs perform better in all measured aspects than sparse hashing, which was designed specifically for efficient memory usage.
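
The definitions above can be illustrated with a small Python sketch. It is not the algorithm of [Botelho et al. 2007]: it uses a simple hash-and-displace style construction that searches for one seed per bucket, so its description takes far more than 2.6 bits per key. The function names (`seeded_hash`, `build_mphf`, `mphf`, `contains`), the choice of BLAKE2b as the underlying hash, and the `max_seed` limit are all assumptions made for illustration. The sketch shows the two properties the abstract relies on: the n keys map one-to-one onto exactly n table slots, and a lookup needs a single probe, with a key comparison to reject keys that are not in the static set.

```python
import hashlib


def seeded_hash(key: str, seed: int, n: int) -> int:
    """Hash `key` into range(n); BLAKE2b with a seed-derived salt is an illustrative choice."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % n


def build_mphf(keys, max_seed=1_000_000):
    """Return per-bucket seeds so that mphf() maps `keys` one-to-one onto range(len(keys)).

    Assumes `keys` is a non-empty list of distinct strings.
    """
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for key in keys:
        buckets[seeded_hash(key, 0, n)].append(key)

    seeds = [0] * n
    occupied = [False] * n
    # Place the largest buckets first, while most slots are still free.
    for b in sorted(range(n), key=lambda i: len(buckets[i]), reverse=True):
        bucket = buckets[b]
        if not bucket:
            break  # only empty buckets remain
        for seed in range(1, max_seed):
            slots = [seeded_hash(k, seed, n) for k in bucket]
            if len(set(slots)) == len(slots) and not any(occupied[s] for s in slots):
                break  # this seed sends the whole bucket to distinct free slots
        else:
            raise RuntimeError("no collision-free seed found")
        seeds[b] = seed
        for s in slots:
            occupied[s] = True
    return seeds


def mphf(key, seeds):
    """Evaluate the function: first-level hash picks the bucket seed, second gives the slot."""
    n = len(seeds)
    return seeded_hash(key, seeds[seeded_hash(key, 0, n)], n)


def build_table(keys):
    """Store each key in its own slot of a table with exactly len(keys) entries."""
    seeds = build_mphf(keys)
    table = [None] * len(keys)
    for key in keys:
        table[mphf(key, seeds)] = key
    return seeds, table


def contains(key, seeds, table):
    """Single probe; comparing the stored key rejects keys outside the static set."""
    return table[mphf(key, seeds)] == key


keys = ["linear", "quadratic", "double", "dense", "cuckoo", "sparse"]
seeds, table = build_table(keys)
```

Here `contains("cuckoo", seeds, table)` is a successful search and `contains("chained", seeds, table)` an unsuccessful one; both touch exactly one table slot. Storing the key itself in the slot is one simple way to support unsuccessful searches, chosen here only for clarity.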