An improved data stream summary: the count-min sketch and its applications

  • Authors: Graham Cormode; S. Muthukrishnan

  • Affiliations: Center for Discrete Mathematics and Computer Science (DIMACS), Rutgers University, Piscataway, NJ; Division of Computer and Information Systems, Rutgers University and AT&T Research

  • Venue: Journal of Algorithms
  • Year: 2005

Abstract

We introduce a new sublinear space data structure, the count-min sketch, for summarizing data streams. Our sketch allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc. The time and space bounds we show for using the CM sketch to solve these problems significantly improve those previously known, typically from 1/ε² to 1/ε in factor.
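
The abstract does not spell out the construction, so the following is only a minimal Python sketch of how a count-min sketch can answer point queries. It assumes the standard layout: a table of ⌈ln(1/δ)⌉ rows by ⌈e/ε⌉ counters, one pairwise-independent hash function per row, and the minimum over rows as the estimate. The class name `CountMinSketch`, the parameters `epsilon`, `delta`, and `seed`, the restriction to non-negative integer item ids, and the particular hash family `((a*x + b) mod p) mod width` are illustrative assumptions, not details taken from this record.

```python
import math
import random


class CountMinSketch:
    """Illustrative count-min sketch over non-negative integer item ids."""

    # Large Mersenne prime for the (assumed) hash family ((a*x + b) % p) % width.
    PRIME = (1 << 61) - 1

    def __init__(self, epsilon, delta, seed=0):
        # Width grows as 1/epsilon, depth as log(1/delta).
        self.width = math.ceil(math.e / epsilon)
        self.depth = math.ceil(math.log(1.0 / delta))
        rng = random.Random(seed)
        # One (a, b) pair per row; a is non-zero.
        self.hashes = [
            (rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
            for _ in range(self.depth)
        ]
        self.table = [[0] * self.width for _ in range(self.depth)]
        self.total = 0  # sum of all counts seen so far

    def _index(self, row, item):
        a, b = self.hashes[row]
        return ((a * item + b) % self.PRIME) % self.width

    def update(self, item, count=1):
        """Add `count` (assumed non-negative) to the item's counter in every row."""
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count
        self.total += count

    def estimate(self, item):
        """Point query: the smallest counter the item maps to across rows."""
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )
```

With non-negative updates, `estimate` never undercounts, because collisions can only add to a counter. Under the assumptions above it overcounts by more than `epsilon * total` with probability at most `delta`. The space used is proportional to (1/ε) · ln(1/δ) counters, which is where the 1/ε factor mentioned in the abstract comes from.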