A Simple, Fast, and Compact Static Dictionary

  • Authors:
  • Scott Schneider;Michael Spertus

  • Affiliations:
  • Symantec Corporation, Culver City, USA 90230;Symantec Corporation, Culver City, USA 90230

  • Venue:
  • ISAAC '09 Proceedings of the 20th International Symposium on Algorithms and Computation
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a new static dictionary that is very fast and compact, while also extremely easy to implement. A combination of properties make this algorithm very attractive for applications requiring large static dictionaries: 1 High performance, with membership queries taking O(1)-time with a near-optimal constant. 1 Continued high performance in external memory, with queries requiring only 1-2 disk seeks. If the dictionary has n items in $\left\{ 0, ..., m\!-\!1 \right\}$ and d is the number of bytes retrieved from disk on each read, then the average number of seeks is $\min\left(1.63, 1 + O\left( \frac{\sqrt{n} \log m}{d} \right)\right)$. 1 Efficient use of space, storing n items from a universe of size m in $n \log m - \frac{1}{2} n \log n + O\left(n + \log \log m\right)$ bits. We prove this space bound with a novel application of the Kolmogorov-Smirnov distribution. 1 Simplicity, with a 20-line pseudo-code construction algorithm and 4-line query algorithm.