The power of simple tabulation hashing

Authors:
Mihai Patrascu; Mikkel Thorup
Affiliations:
AT&T Labs, Florham Park, NJ, USA; AT&T Labs, Florham Park, NJ, USA
Venue:
Proceedings of the forty-third annual ACM symposium on Theory of computing
Year:
2011

Abstract

Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Carter and Wegman (STOC'77). Keys are viewed as consisting of c characters. We initialize c tables T_1, ..., T_c mapping characters to random hash codes. A key x=(x_1, ..., x_c) is hashed to T_1[x_1] xor ... xor T_c[x_c]. While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., Chernoff-type concentration, min-wise hashing for estimating set intersection, and cuckoo hashing.
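
The scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `SimpleTabulationHash`, the parameters `char_bits`, `out_bits`, and `seed`, and the use of Python's `random.Random` (a pseudorandom generator standing in for truly random table entries) are all assumptions made for the example. Keys are taken to be non-negative integers split into c characters of `char_bits` bits each.

```python
import random


class SimpleTabulationHash:
    """Sketch of simple tabulation hashing for fixed-width integer keys.

    All names and defaults here are illustrative assumptions.
    """

    def __init__(self, c=4, char_bits=8, out_bits=32, seed=None):
        rng = random.Random(seed)  # assumption: PRNG in place of true randomness
        self.c = c
        self.char_bits = char_bits
        self.mask = (1 << char_bits) - 1
        # T_1, ..., T_c: one table per character position, mapping every
        # possible character value to a random out_bits-bit hash code.
        self.tables = [
            [rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
            for _ in range(c)
        ]

    def __call__(self, key):
        if not 0 <= key < (1 << (self.c * self.char_bits)):
            raise ValueError("key does not fit in c characters")
        h = 0
        for table in self.tables:
            # Look up the current (lowest) character and xor it in.
            h ^= table[key & self.mask]
            key >>= self.char_bits
        return h


# Usage: 16-bit keys viewed as c = 2 characters of 8 bits each.
h = SimpleTabulationHash(c=2, char_bits=8, out_bits=32, seed=1)
print(h(0x1234))

# The four keys (0,0), (0,1), (1,0), (1,1) use each table entry exactly
# twice, so their hash codes always xor to zero.
print(h(0x0000) ^ h(0x0001) ^ h(0x0100) ^ h(0x0101))  # prints 0
```

The last line illustrates why the scheme is not 4-independent: for any choice of tables, the hash code of the fourth key is fully determined by the other three.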