Hashing practice: analysis of hashing and universal hashing

  • Authors:
  • M. V. Ramakrishna

  • Affiliations:
  • Michigan State Univ., East Lansing

  • Venue:
  • SIGMOD '88 Proceedings of the 1988 ACM SIGMOD international conference on Management of data
  • Year:
  • 1988

Abstract

Much of the literature on hashing deals with overflow handling (collision resolution) techniques and their analysis. What do all the analytical results mean in practice, and how can they be achieved with practical files? This paper considers the problem of achieving the analytical performance of hashing techniques in practice, with reference to successful search lengths, unsuccessful search lengths, and the expected worst-case performance (the expected length of the longest probe sequence). There has been no previous attempt to explicitly link the analytical results to the performance of real-life files. Also, the previously reported experimental results deal mostly with successful search lengths. We show why the well-known division method performs “well” under a specific model of selecting the test file. We formulate and justify a hypothesis that, by choosing functions from a particular class of hashing functions, the analytical performance can be obtained in practice on real-life files. The experimental results presented strongly support our hypothesis. Several interesting problems that arise are mentioned in the conclusion.
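The abstract contrasts the division method with functions drawn from a class of hashing functions, so a small illustration may help. The sketch below is not taken from the paper. It assumes the textbook Carter-Wegman class h(x) = ((a*x + b) mod p) mod m as a stand-in for the "particular class" (the abstract does not say which class is used), and it assumes separate chaining as the overflow scheme. The names `division_hash`, `make_universal_hash`, and `average_successful_search_length` are hypothetical.

```python
import random


def division_hash(key: int, m: int) -> int:
    """Division method: the remainder of the key modulo the table size m."""
    return key % m


def make_universal_hash(m: int, p: int = 2_147_483_647, rng=None):
    """Draw one function from the class h(x) = ((a*x + b) mod p) mod m.

    Assumption: this Carter-Wegman class stands in for the class the
    abstract mentions. p is a prime (here 2**31 - 1) and keys are
    assumed to be integers in the range 0 <= key < p.
    """
    rng = rng or random.Random()
    a = rng.randrange(1, p)  # a is nonzero
    b = rng.randrange(0, p)

    def h(key: int) -> int:
        return ((a * key + b) % p) % m

    return h


def average_successful_search_length(keys, hash_fn, m: int) -> float:
    """Average probes to find a stored key, assuming separate chaining.

    The i-th key appended to a chain costs i probes to find, so the
    total cost is accumulated as each key is inserted.
    """
    chain_lengths = [0] * m
    total_probes = 0
    for key in keys:
        slot = hash_fn(key)
        chain_lengths[slot] += 1
        total_probes += chain_lengths[slot]
    return total_probes / len(keys)
```

With a table of size 7, the keys 0, 7, and 14 all land in slot 0 under the division method, so the average successful search length is (1 + 2 + 3) / 3 = 2.0. Under this sketch's assumptions, a function drawn at random from the class does not depend on such regularities in the keys, which is one way to read the motivation behind the hypothesis.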