The analysis of closed hashing under limited randomness
STOC '90 Proceedings of the twenty-second annual ACM symposium on Theory of computing
The C programming language
A reliable randomized algorithm for the closest-pair problem
Journal of Algorithms
The art of computer programming, volume 3: (2nd ed.) sorting and searching
The C++ Programming Language
Universal Hashing and k-Wise Independent Random Variables via Integer Arithmetic without Primes
STACS '96 Proceedings of the 13th Annual Symposium on Theoretical Aspects of Computer Science
Polynomial Hash Functions Are Reliable (Extended Abstract)
ICALP '92 Proceedings of the 19th International Colloquium on Automata, Languages and Programming
Universal classes of hash functions (Extended Abstract)
STOC '77 Proceedings of the ninth annual ACM symposium on Theory of computing
Closed Hashing is Computable and Optimally Randomizable with Universal Hash Functions
Tabulation based 4-universal hashing with applications to second moment estimation
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Journal of Algorithms
Linear probing with constant independence
Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
Why simple hash functions work: exploiting the entropy in a data stream
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
The power of simple tabulation hashing
Proceedings of the forty-third annual ACM symposium on Theory of computing
Linear Probing with 5-wise Independence
SIAM Review
The universality of iterated hashing over variable-length strings
Discrete Applied Mathematics
The Power of Simple Tabulation Hashing
Journal of the ACM (JACM)
SIAM Journal on Computing
Linear probing is one of the most popular implementations of dynamic hash tables that store all keys in a single array. Given a key, we first hash it to a location. We then probe consecutive locations until we find either the key or an empty location. At STOC'07, Pagh et al. presented data sets on which the standard implementation of 2-universal hashing leads to an expected Ω(log n) probes. They also showed that with 5-universal hashing, the expected number of probes is constant. Unfortunately, we do not have 5-universal hashing for, say, variable-length strings. When we want such complex hashing from a complex domain, the generic standard solution is to first hash, collision-free with high probability (w.h.p.), into a simpler intermediate domain, and then apply the complicated hash function to this intermediate domain. Our contribution is to show that, for an expected constant number of linear probes, it suffices that each key has O(1) expected collisions under the first hash function, as long as the second hash function is 5-universal. This means that the intermediate domain can be n times smaller, and such a smaller intermediate domain typically means that the overall hash function can be made simpler and at least twice as fast. The same doubling of hashing speed for O(1) expected probes follows for most domains bigger than 32-bit integers, e.g., 64-bit integers and fixed-length strings. In addition, we study how the overhead from linear probing diminishes as the array gets larger, and what happens if strings are stored directly as intervals of the array. These cases were not considered by Pagh et al.
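The two-stage scheme described above can be illustrated with a minimal sketch. It is not the construction from the paper: the class names, the Mersenne prime 2^61 - 1, the polynomial string hash used as the first stage, and the final reduction modulo the table size are all assumptions made for illustration. The second stage is a random degree-4 polynomial over a prime field, a standard way to obtain 5-wise independence on that field. The sketch does not reproduce the smaller intermediate domain or the speed-up that the abstract claims.

```python
import random

# Field size for both stages (an assumption of this sketch, not from the source).
P = (1 << 61) - 1


class TwoLevelHash:
    """h(key) = h2(h1(key)); both stages are illustrative choices."""

    def __init__(self, table_size, rng):
        self.m = table_size
        self.base = rng.randrange(1, P)                     # random base for stage 1
        self.coeffs = [rng.randrange(P) for _ in range(5)]  # degree-4 polynomial

    def h1(self, key):
        # Stage 1: polynomial string hash into the intermediate domain Z_P.
        acc = 0
        for b in key.encode("utf-8"):
            acc = (acc * self.base + b + 1) % P
        return acc

    def h2(self, x):
        # Stage 2: Horner evaluation of a random degree-4 polynomial over Z_P.
        acc = 0
        for c in self.coeffs:
            acc = (acc * x + c) % P
        # The reduction to table indices is only approximately uniform.
        return acc % self.m

    def __call__(self, key):
        return self.h2(self.h1(key))


class LinearProbingTable:
    """All keys live in one array; collisions are resolved by scanning forward."""

    def __init__(self, capacity, seed=0):
        self.slots = [None] * capacity
        self.hash = TwoLevelHash(capacity, random.Random(seed))
        self.size = 0

    def _probe(self, key):
        # Return (index, probes): index holds the key or is the first empty slot.
        i = self.hash(key)
        probes = 1
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)
            probes += 1
        return i, probes

    def insert(self, key):
        i, probes = self._probe(key)
        if self.slots[i] is None:
            # Keep one slot empty so that every probe sequence terminates.
            if self.size + 1 >= len(self.slots):
                raise RuntimeError("table full")
            self.slots[i] = key
            self.size += 1
        return probes

    def contains(self, key):
        i, _ = self._probe(key)
        return self.slots[i] is not None


table = LinearProbingTable(64, seed=1)
keys = [f"key{i}" for i in range(32)]
probe_counts = [table.insert(k) for k in keys]
```

Here `probe_counts` records how many locations each insertion inspected, the quantity whose expectation the abstract bounds. Because the table compares the stored keys themselves, collisions in the first stage affect only the probe counts, not correctness.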