Hashing practice: analysis of hashing and universal hashing

Authors:
M. V. Ramakrishna
Affiliations:
Michigan State Univ., East Lansing
Venue:
SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
Year:
1988


Abstract

Much of the literature on hashing deals with overflow handling (collision resolution) techniques and their analysis. What do all these analytical results mean in practice, and how can they be achieved with practical files? This paper considers the problem of achieving the analytical performance of hashing techniques in practice, with reference to successful search lengths, unsuccessful search lengths, and the expected worst-case performance (the expected length of the longest probe sequence). There has been no previous attempt to explicitly link the analytical results to the performance of real-life files. Also, the previously reported experimental results deal mostly with successful search lengths. We show why the well-known division method performs “well” under a specific model of selecting the test file. We formulate and justify a hypothesis that, by choosing functions from a particular class of hashing functions, the analytical performance can be obtained in practice on real-life files. The experimental results presented strongly support our hypothesis. Several interesting problems that arise are mentioned in the conclusion.
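
The two ingredients named in the abstract, the division method and a hash function chosen from a class of functions, can be illustrated with a minimal sketch. The abstract does not say which class the paper uses. The sketch assumes the standard Carter-Wegman family h(k) = ((a*k + b) mod P) mod m as one example of a universal class. The prime `P`, the helper names, and the separate-chaining search-length measure are all illustrative assumptions, not the paper's experimental setup.

```python
import random

# Assumption: a prime larger than every key (here the Mersenne prime 2**31 - 1).
P = 2_147_483_647


def division_hash(key: int, m: int) -> int:
    """Division method: the bucket is the key's remainder modulo the table size m."""
    return key % m


def make_universal_hash(m: int, rng: random.Random):
    """Pick one function h(k) = ((a*k + b) mod P) mod m at random from the family."""
    a = rng.randrange(1, P)  # a drawn from 1 .. P-1
    b = rng.randrange(0, P)  # b drawn from 0 .. P-1
    return lambda key: ((a * key + b) % P) % m


def average_successful_search_length(keys, h, m: int) -> float:
    """Average number of probes needed to find a stored key, assuming
    separate chaining with each new key appended to the end of its chain."""
    chain_len = [0] * m
    total_probes = 0
    for k in keys:
        bucket = h(k)
        chain_len[bucket] += 1
        total_probes += chain_len[bucket]  # position of k within its chain
    return total_probes / len(keys)


if __name__ == "__main__":
    m = 8
    # Contrived key set (not from the paper): every key is a multiple of m.
    keys = [m * i for i in range(1, 9)]
    by_division = average_successful_search_length(
        keys, lambda k: division_hash(k, m), m
    )
    h = make_universal_hash(m, random.Random(1988))
    by_universal = average_successful_search_length(keys, h, m)
    print("division method:", by_division)
    print("randomly chosen function:", by_universal)
```

On this deliberately adversarial key set the division method places every key in bucket 0, so its average successful search length is 4.5. The value for the randomly chosen function depends on the draw of `a` and `b`. The sketch only shows how such a comparison could be set up and makes no claim about the paper's measured results.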