Maintaining external memory efficient hash tables

  • Authors: Philipp Woelfel
  • Affiliations: Dept. of Computer Science, Univ. of Toronto, Toronto, ON
  • Venue: APPROX'06/RANDOM'06 Proceedings of the 9th international conference on Approximation Algorithms for Combinatorial Optimization Problems, and 10th international conference on Randomization and Computation
  • Year: 2006

Abstract

In typical applications of hashing algorithms, the amount of data to be stored is often too large to fit into internal memory. In this case, it is desirable to find the data with as few non-consecutive, or at least non-oblivious, probes into external memory as possible. Extending a static scheme of Pagh [11], we obtain new randomized algorithms for maintaining hash tables, where a hash function can be evaluated in constant time and by probing only one external memory cell or O(1) consecutive external memory cells. We describe a dynamic version of Pagh's hashing scheme achieving 100% table utilization but requiring (2+ε)n log n space for the hash function encoding as well as (3+ε)n log n space for the auxiliary data structure. Update operations are possible in expected constant amortized time. Then we show how to reduce the space for the hash function encoding and the auxiliary data structure to O(n log log n). We achieve 100% utilization in the static version (and thus a minimal perfect hash function) and 1−ε utilization in the dynamic case.
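
The abstract does not spell out the static scheme it extends. As a rough illustration only, the Python sketch below assumes a displacement-based form h(x) = (f(x) + d[g(x)]) mod n, where evaluating h reads a single entry of a displacement table d (the "one external memory cell"). The function names, the salted use of Python's built-in hash in place of proper hash families, the greedy construction, and the table size of about (2+ε)n entries are all assumptions made for this example; it is not the paper's dynamic algorithm.

```python
import random


def build_displacement_table(keys, eps=0.5, seed=1):
    """Static sketch: find displacements so that h is a bijection onto 0..n-1.

    Assumes `keys` is a non-empty collection of distinct hashable values.
    Returns (salt_f, salt_g, disp, n); all names are hypothetical.
    """
    keys = list(keys)
    n = len(keys)
    m = int((2 + eps) * n) + 1  # assumed number of displacement entries
    rng = random.Random(seed)
    while True:
        # Fresh salts stand in for drawing f and g from hash families.
        salt_f, salt_g = rng.getrandbits(64), rng.getrandbits(64)
        buckets = {}
        for x in keys:
            buckets.setdefault(hash((salt_g, x)) % m, []).append(x)
        disp = [0] * m
        used = [False] * n
        ok = True
        # Place the largest buckets first, while the table is still sparse.
        for b, members in sorted(buckets.items(), key=lambda kv: -len(kv[1])):
            base = [hash((salt_f, x)) % n for x in members]
            for d in range(n):
                slots = {(v + d) % n for v in base}
                if len(slots) == len(members) and not any(used[s] for s in slots):
                    for s in slots:
                        used[s] = True
                    disp[b] = d
                    break
            else:
                ok = False  # no displacement fits this bucket: retry with new salts
                break
        if ok:
            return salt_f, salt_g, disp, n


def lookup_slot(x, salt_f, salt_g, disp, n):
    """Evaluate h(x) with exactly one read of the displacement table."""
    d = disp[hash((salt_g, x)) % len(disp)]  # the single table probe
    return (hash((salt_f, x)) % n + d) % n


if __name__ == "__main__":
    keys = [3 * i + 7 for i in range(200)]
    table = build_displacement_table(keys)
    slots = sorted(lookup_slot(x, *table) for x in keys)
    print(slots == list(range(len(keys))))
```

In this sketch every key receives its own slot in a table of exactly n cells, which corresponds to the 100% utilization (minimal perfect hashing) mentioned for the static case. Supporting insertions and deletions in expected constant amortized time, as the abstract claims for the dynamic version, needs the auxiliary data structure that this example omits.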