A perfect hash function (PHF) is an injective function that maps the keys of a set S to unique values. Since no collisions occur, each key can be retrieved from a hash table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory-efficient storage and fast retrieval of items from static sets. Unlike other hashing schemes, MPHFs completely avoid the space and time wasted in dealing with collisions. Until recently, the practical implementations found in the literature needed O(log n) bits per key, where n is the number of keys, to store an MPHF description, a space overhead similar to that of other hashing schemes. Recent results on MPHFs changed this scenario: an MPHF can now be described in approximately 2.6 bits per key. The objective of this paper is to show that, given these results, MPHFs are a good option for indexing internal memory when static key sets are involved and both successful and unsuccessful searches are allowed. We show that MPHFs provide the best tradeoff between space usage and lookup time when compared with other open addressing and chaining hash schemes such as linear hashing, quadratic hashing, double hashing, dense hashing, cuckoo hashing, sparse hashing, hopscotch hashing, chaining with the move-to-front heuristic, and exact fit. We considered lookup time for successful and unsuccessful searches in two scenarios: (i) the MPHF description fits in the CPU cache and (ii) the MPHF description does not fit entirely in the CPU cache. In lookup time, minimal perfect hashing outperforms the other hashing schemes in both scenarios and, in the first scenario, it remains faster even when the compared methods leave more than 80% of the hash table entries free. In space overhead (the space used beyond the key-value pairs themselves), minimal perfect hashing is lower than the other hashing schemes by a factor of O(log n) in both scenarios.