Bottom-k sketches: better and more efficient estimation of aggregates

  • Authors: Edith Cohen; Haim Kaplan
  • Affiliations: AT&T Labs-Research; Tel Aviv University
  • Venue: Proceedings of the 2007 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems
  • Year: 2007

Abstract

A bottom-k sketch is a summary of a set of items with nonnegative weights that supports the computation of approximate aggregates over the set. Bottom-k sketches are obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the item's weight. For each subset of interest, the bottom-k sketch consists of the k items with the smallest ranks, together with their ranks. Bottom-k sketches have numerous applications. To facilitate their deployment, we develop and analyze data structures and estimators for bottom-k sketches. Our novel estimators and algorithms show that bottom-k sketches are a superior alternative to other sketching methods, both in the efficiency of obtaining the sketches and in the accuracy of the estimates derived from them.
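
The construction described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: ranks are drawn from an exponential distribution with rate equal to the item's weight (the abstract only says the distribution depends on the weight), the sketch additionally records the (k+1)-st smallest rank as a threshold, and the sum estimate is a simple inverse-probability estimate conditioned on that threshold. The function names `assign_ranks`, `bottom_k_sketch`, and `estimate_weight_sum` are hypothetical and chosen for illustration; this is not presented as the authors' data structures or estimators.

```python
import heapq
import math
import random


def assign_ranks(weights, seed=0):
    """Draw one independent rank per item of the ground set.

    Assumption: rank ~ Exponential(rate = weight), so heavier items
    tend to receive smaller ranks. Zero-weight items are skipped.
    """
    rng = random.Random(seed)
    return {item: rng.expovariate(w) for item, w in weights.items() if w > 0}


def bottom_k_sketch(ranks, subset, k):
    """Return (sketch, threshold) for one subset of the ground set.

    sketch    : the k smallest (rank, item) pairs, in increasing rank order
    threshold : the (k+1)-st smallest rank, or infinity if the subset
                has at most k ranked items (assumed extra bookkeeping)
    """
    smallest = heapq.nsmallest(
        k + 1, ((ranks[i], i) for i in subset if i in ranks)
    )
    sketch = smallest[:k]
    threshold = smallest[k][0] if len(smallest) > k else math.inf
    return sketch, threshold


def estimate_weight_sum(sketch, threshold, weights):
    """Estimate the total weight of the subset from its sketch.

    Each sketched item is weighted by the inverse of the probability
    that its exponential rank falls below the threshold.
    """
    total = 0.0
    for _rank, item in sketch:
        w = weights[item]
        p = 1.0 if math.isinf(threshold) else 1.0 - math.exp(-w * threshold)
        total += w / p
    return total


if __name__ == "__main__":
    weights = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0, "e": 5.0}
    ranks = assign_ranks(weights, seed=42)
    sketch, threshold = bottom_k_sketch(ranks, ["a", "c", "d", "e"], k=2)
    print(sketch, threshold)
    print(estimate_weight_sum(sketch, threshold, weights))
```

Because ranks are assigned once on the ground set, sketches of different subsets reuse the same ranks. When a subset has at most k items, the threshold is infinite and the estimate reduces to the exact sum.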