When cycles are cheap, some tables can be huge

  • Authors:
  • Bin Fan;Dong Zhou;Hyeontaek Lim;Michael Kaminsky;David G. Andersen

  • Affiliations:
  • Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Intel Labs and Carnegie Mellon University;Carnegie Mellon University

  • Venue:
  • HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
  • Year:
  • 2013

Abstract

The goal of this paper is to raise a new question: What changes in operating systems and networks if it were feasible to have a (type of) lookup table that supported billions, or hundreds of billions, of entries, using only a few bits per entry? We do so by showing that the progress of Moore's law, continuing to give more and more transistors per chip, makes it possible to apply formerly ludicrous amounts of brute-force parallel computation to find space-saving opportunities. We make two primary observations. First, some applications can tolerate getting an incorrect answer from the table if they query for a key that is not in the table. For these applications, we can discard the keys entirely, using storage space only for the values. Second, for some applications, the value is not arbitrary. If the range of output values is small, we can instead view the problem as one of set separation. These two observations allow us to shrink the size of the mapping by brute-force searching for a "perfect mapping" from inputs to outputs that (1) does not store the input keys; and (2) avoids collisions (and thus the related storage). Our preliminary results show that we can reduce memory consumption by an order of magnitude compared to traditional hash tables while providing competitive or better lookup performance.
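
To make the brute-force idea concrete, here is a minimal sketch of the set-separation view for one small group of keys with 1-bit values. It is an illustration under stated assumptions, not the authors' implementation: the names `hash_bit`, `find_seed`, and `lookup`, the use of SHA-256 as the hash family, the group size of eight keys, and the 16-bit seed budget are all hypothetical choices made for this example. The search tries seeds until one hash function happens to map every key in the group to its value; only the seed is then stored, never the keys.

```python
import hashlib
from typing import Dict, Optional


def hash_bit(seed: int, key: str) -> int:
    """Map (seed, key) to a single bit. SHA-256 is an assumed stand-in hash family."""
    digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
    return digest[0] & 1


def find_seed(group: Dict[str, int], max_seed: int = 1 << 16) -> Optional[int]:
    """Brute-force search for a seed whose hash bit equals the value of every key.

    Each seed succeeds with probability 2**-len(group), so the group must be
    small for the search to finish within the (assumed) 16-bit seed budget.
    """
    for seed in range(max_seed):
        if all(hash_bit(seed, key) == value for key, value in group.items()):
            return seed
    return None  # no seed found within the budget


def lookup(seed: int, key: str) -> int:
    """Answer a query from the stored seed alone; no keys are kept."""
    return hash_bit(seed, key)


# Hypothetical group: eight keys, each mapped to a 1-bit value (e.g. an output port).
group = {f"flow-{i}": i % 2 for i in range(8)}
seed = find_seed(group)

if seed is not None:
    # Every key in the group gets its correct value back.
    assert all(lookup(seed, key) == value for key, value in group.items())
    # A key that was never inserted still gets some answer (0 or 1), which is
    # the "incorrect answer for missing keys" that the application must tolerate.
    print(lookup(seed, "flow-not-in-table"))
```

In this toy setup a 16-bit seed covering eight keys amounts to two bits per key, and the cost of finding the seed grows exponentially with the group size. That trade of computation for space is the one the abstract describes; these figures come from the sketch's parameters, not from the paper's results.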