Linear hash functions

Authors:
Noga Alon;Martin Dietzfelbinger;Peter Bro Miltersen;Erez Petrank;Gábor Tardos
Affiliations:
Tel-Aviv Univ., Tel-Aviv, Israel and Institute for Advanced Study, Princeton, NJ;Technische Univ. Ilmenau, Ilmenau, Germany;Univ. of Aarhus, Aarhus, Denmark;IBM, Haifa, Israel;Hungarian Academy of Sciences, Budapest, Hungary
Venue:
Journal of the ACM (JACM)
Year:
1999

Citing 9
Cited 10

Storing a Sparse Table with O(1) Worst Case Access Time

Journal of the ACM (JACM)
Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories

Acta Informatica
A fast and simple randomized parallel algorithm for the maximal independent set problem

Journal of Algorithms
Introduction to algorithms

Introduction to algorithms
The computational complexity of universal hashing

Theoretical Computer Science - Special issue on structure in complexity theory
Dynamic Perfect Hashing: Upper and Lower Bounds

SIAM Journal on Computing
Sorting in linear time?

STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
A reliable randomized algorithm for the closest-pair problem

Journal of Algorithms
Polynomial Hash Functions Are Reliable (Extended Abstract)

ICALP '92 Proceedings of the 19th International Colloquium on Automata, Languages and Programming

A read-once branching program lower bound of Ω(2^(n/4)) for integer multiplication using universal hashing

STOC '01 Proceedings of the thirty-third annual ACM symposium on Theory of computing
Viceroy: a scalable and dynamic emulation of the butterfly

Proceedings of the twenty-first annual symposium on Principles of distributed computing
Parity graph-driven read-once branching programs and an exponential lower bound for integer multiplication

Theoretical Computer Science
External perfect hashing for very large key sets

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Hashed samples: selectivity estimators for set similarity selection queries

Proceedings of the VLDB Endowment
Minimal perfect hashing: A competitive method for indexing internal memory

Information Sciences: an International Journal
Mitigating dictionary attacks on password-protected local storage

CRYPTO'06 Proceedings of the 26th annual international conference on Advances in Cryptology
Smaller footprint for Java collections

ECOOP'12 Proceedings of the 26th European conference on Object-Oriented Programming
Practical perfect hashing in nearly optimal space

Information Systems
Simple and space-efficient minimal perfect hash functions

WADS'07 Proceedings of the 10th international conference on Algorithms and Data Structures

Quantified Score

Hi-index	0.02

Abstract

Consider the set H of all linear (or affine) transformations between two vector spaces over a finite field F. We study how good H is as a class of hash functions, namely we consider hashing a set S of size n into a range having the same cardinality n by a randomly chosen function from H and look at the expected size of the largest hash bucket. H is a universal class of hash functions for any finite field, but with respect to our measure different fields behave differently.

If the finite field F has n elements, then there is a bad set S ⊂ F^2 of size n with expected maximal bucket size Ω(n^(1/3)). If n is a perfect square, then there is even a bad set with largest bucket size always at least √n. (This is worst possible, since with respect to a universal class of hash functions every set of size n has expected largest bucket size below √n + 1/2.)

If, however, we consider the field of two elements, then we get much better bounds. The best previously known upper bound on the expected size of the largest bucket for this class was O(2^√(log n)). We reduce this upper bound to O(log n log log n). Note that this is not far from the guarantee for a random function. There, the average largest bucket would be Θ(log n / log log n).

In the course of our proof we develop a tool which may be of independent interest. Suppose we have a subset S of a vector space D over Z_2, and consider a random linear mapping of D to a smaller vector space R. If the cardinality of S is larger than c_ε |R| log |R|, then with probability 1 - ε, the image of S will cover all elements in the range.
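
The quantity studied in the abstract for the field of two elements can be estimated empirically. The following is a minimal sketch, not the authors' method: it assumes keys are integers read as bit vectors over Z_2, represents a random linear map as a list of row bitmasks, and averages the largest bucket over several independent draws. All function names and parameters here (random_linear_map, apply_map, largest_bucket, average_largest_bucket, in_bits, out_bits, trials, seed) are illustrative choices and do not come from the paper.

```python
import random
from collections import Counter


def random_linear_map(in_bits, out_bits, rng):
    """Draw a uniformly random out_bits x in_bits matrix over GF(2).

    Each row is stored as an integer bitmask of width in_bits.
    """
    return [rng.getrandbits(in_bits) for _ in range(out_bits)]


def apply_map(rows, x):
    """Matrix-vector product over GF(2).

    Bit i of the result is the parity of the bits shared by rows[i] and x.
    """
    y = 0
    for i, row in enumerate(rows):
        parity = bin(row & x).count("1") & 1
        y |= parity << i
    return y


def largest_bucket(rows, keys):
    """Size of the fullest bucket when every key is hashed with the given map."""
    counts = Counter(apply_map(rows, x) for x in keys)
    return max(counts.values())


def average_largest_bucket(keys, in_bits, out_bits, trials, seed=0):
    """Average the largest bucket size over `trials` independent random maps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        rows = random_linear_map(in_bits, out_bits, rng)
        total += largest_bucket(rows, keys)
    return total / trials


if __name__ == "__main__":
    # Hash n = 2^10 random 32-bit keys into a range of the same size n.
    n_bits = 10
    keys = random.Random(42).sample(range(1 << 32), 1 << n_bits)
    print(average_largest_bucket(keys, in_bits=32, out_bits=n_bits, trials=20))
```

A random key set such as the one above only illustrates typical behaviour. The bounds in the abstract concern the worst-case set S, so an experiment like this cannot confirm or refute them.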