External-memory multimaps

  • Authors: Elaine Angelino; Michael T. Goodrich; Michael Mitzenmacher; Justin Thaler

  • Affiliations (in author order): School of Engineering and Applied Sciences, Harvard University; Dept. of Computer Science, University of California, Irvine; School of Engineering and Applied Sciences, Harvard University; School of Engineering and Applied Sciences, Harvard University

  • Venue: ISAAC'11 Proceedings of the 22nd International Conference on Algorithms and Computation
  • Year: 2011

Abstract

Many data structures support dictionaries, also known as maps or associative arrays, which store and manage a set of key-value pairs. A multimap is a generalization that allows multiple values to be associated with the same key. For example, the inverted file data structure commonly used in search engines is a type of multimap, with words as keys and document pointers as values. We study the multimap abstract data type and how it can be implemented efficiently online in external memory frameworks, with constant expected I/O performance. The key technique used to achieve our results is a combination of cuckoo hashing using buckets that hold multiple items with a multiqueue implementation to cope with varying numbers of values per key. Our results are provably optimal up to constant factors.
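
To make the multimap interface and the bucketed cuckoo idea concrete, here is a minimal in-memory sketch in Python. It is an illustration under stated assumptions, not the construction from the paper: the class name CuckooMultimap, its method names, the salted use of Python's built-in hash, the doubling rehash policy, and all parameter defaults are hypothetical. In particular, a plain Python list stands in for the multiqueue that the abstract says handles a varying number of values per key, and nothing here models disk blocks or counts I/Os.

```python
import random


class CuckooMultimap:
    """Toy multimap: two tables of small buckets, each key lives in one of
    its two candidate buckets and carries a list of its values."""

    def __init__(self, num_buckets=8, bucket_size=2, max_kicks=50, seed=1):
        self.num_buckets = num_buckets    # buckets per table (assumed default)
        self.bucket_size = bucket_size    # distinct keys per bucket (assumed)
        self.max_kicks = max_kicks        # eviction limit before rebuilding
        self.rng = random.Random(seed)
        self._new_tables()

    def _new_tables(self):
        # Fresh salts give two fresh hash functions; buckets are small dicts.
        self.salts = (self.rng.getrandbits(64), self.rng.getrandbits(64))
        self.tables = [[{} for _ in range(self.num_buckets)] for _ in range(2)]

    def _bucket(self, t, key):
        return self.tables[t][hash((self.salts[t], key)) % self.num_buckets]

    def _lookup(self, key):
        # A key can only be in one of two buckets, so a lookup probes at most two.
        for t in (0, 1):
            bucket = self._bucket(t, key)
            if key in bucket:
                return bucket[key]
        return None

    def insert(self, key, value):
        values = self._lookup(key)
        if values is not None:
            values.append(value)          # existing key: extend its value list
        else:
            self._place(key, [value])     # new key: cuckoo placement

    def _place(self, key, values):
        t = 0
        for _ in range(self.max_kicks):
            bucket = self._bucket(t, key)
            if len(bucket) < self.bucket_size:
                bucket[key] = values
                return
            # Bucket full: evict a random resident and move it to the other table.
            victim = self.rng.choice(list(bucket))
            victim_values = bucket.pop(victim)
            bucket[key] = values
            key, values = victim, victim_values
            t = 1 - t
        self._rehash(key, values)

    def _rehash(self, key, values):
        # Too many evictions: rebuild with twice the buckets and new salts.
        entries = [(key, values)]
        for table in self.tables:
            for bucket in table:
                entries.extend(bucket.items())
        self.num_buckets *= 2
        self._new_tables()
        for k, v in entries:
            self._place(k, v)

    def find_all(self, key):
        return list(self._lookup(key) or [])

    def remove(self, key, value):
        for t in (0, 1):
            bucket = self._bucket(t, key)
            if key in bucket and value in bucket[key]:
                bucket[key].remove(value)
                if not bucket[key]:
                    del bucket[key]       # drop keys with no remaining values
                return True
        return False


# Inverted-file style usage: words are keys, document ids are values.
index = CuckooMultimap()
for word, doc_id in [("cuckoo", 1), ("hash", 1), ("cuckoo", 7)]:
    index.insert(word, doc_id)
print(index.find_all("cuckoo"))  # [1, 7]
```

The point of the sketch is that a lookup inspects at most two buckets regardless of table size. If each bucket were laid out as one disk block, that would be a constant number of block reads per lookup, which is the flavor of guarantee the abstract describes; the paper's actual layout and analysis are not reproduced here.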