Hashing is fundamental to many algorithms and data structures widely used in practice. Theoretical analysis of hashing has followed two main approaches. First, one can assume that the hash function is truly random, mapping each data item independently and uniformly to the range. This idealized model is unrealistic because a truly random hash function requires an exponential number of bits to describe. Alternatively, one can prove rigorous performance bounds for explicit families of hash functions, such as 2-universal or O(1)-wise independent families. For such families, the performance guarantees are often noticeably weaker than for ideal hashing. In practice, however, it is commonly observed that simple hash functions, including 2-universal hash functions, perform as predicted by the idealized analysis for truly random hash functions. In this paper, we try to explain this phenomenon. We demonstrate that the strong performance of universal hash functions in practice can arise naturally from the combined randomness of the hash function and the data. Specifically, following the large body of literature on random sources and randomness extraction, we model the data as coming from a "block source," whereby each new data item has some "entropy" given the previous ones. As long as the (Rényi) entropy per data item is sufficiently large, the performance when choosing a hash function from a 2-universal family is essentially the same as for a truly random hash function. We describe results for several sample applications, including linear probing, balanced allocations, and Bloom filters.
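The comparison described above can be illustrated with a small experiment. The following is a minimal sketch, not the paper's own experiment or analysis. It assumes the standard 2-universal family h(x) = ((a·x + b) mod p) mod m and a toy "block source" in which each key equals the previous key plus a fresh random offset, so every item carries some entropy given its predecessors. All function names, the prime `P`, and the parameters are illustrative assumptions.

```python
import random

# Assumption: a Mersenne prime chosen to exceed every key generated below.
P = (1 << 61) - 1


def make_two_universal(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m from a 2-universal family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def make_truly_random(m, rng):
    """Idealized hash: an independent uniform value per distinct key (memoized)."""
    table = {}

    def h(x):
        if x not in table:
            table[x] = rng.randrange(m)
        return table[x]

    return h


def block_source_keys(n, bits_per_item, rng):
    """Toy block source (an assumption): each key adds a fresh random offset
    of about `bits_per_item` bits to the previous key, so keys are distinct."""
    keys, current = [], 0
    for _ in range(n):
        current += 1 + rng.randrange(1 << bits_per_item)
        keys.append(current)
    return keys


def linear_probing_cost(keys, h, m):
    """Insert distinct keys into a table of size m with linear probing.
    Returns the average number of probes per insertion. Requires len(keys) < m."""
    slots = [None] * m
    probes = 0
    for k in keys:
        i = h(k)
        probes += 1
        while slots[i] is not None:
            i = (i + 1) % m
            probes += 1
        slots[i] = k
    return probes / len(keys)


def compare(n, bits_per_item, seed=0):
    """Return (cost with 2-universal hash, cost with truly random hash)
    for the same keys at load factor 1/2."""
    rng = random.Random(seed)
    m = 2 * n
    keys = block_source_keys(n, bits_per_item, rng)
    universal_cost = linear_probing_cost(keys, make_two_universal(m, rng), m)
    random_cost = linear_probing_cost(keys, make_truly_random(m, rng), m)
    return universal_cost, random_cost


if __name__ == "__main__":
    u, r = compare(n=5000, bits_per_item=20, seed=1)
    print(f"2-universal: {u:.3f} probes/insert, truly random: {r:.3f} probes/insert")
```

Lowering `bits_per_item` reduces the entropy each key contributes, which is a simple way to probe how the two costs diverge as the data becomes more predictable. A single run with one seed is only suggestive and says nothing about the bounds proved in the paper.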