Counting distinct items over update streams

  • Authors: Sumit Ganguly
  • Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India
  • Venue: Theoretical Computer Science
  • Year: 2007

Abstract

In data streaming applications, data arrives at rapid rates and in high volume, so each stream update must be processed very efficiently in both time and space. A data stream is a sequence of data records that must be processed continuously, in an online fashion, using sub-linear space and sub-linear processing time. We consider the problem of tracking the number of distinct items over data streams that allow both insertion and deletion operations. We present two algorithms that improve on the space and time complexity of existing algorithms.
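
To make the problem statement concrete, the following is a minimal sketch of an exact baseline, not either of the paper's algorithms. It assumes each update is an (item, delta) pair with delta = +1 for an insertion and -1 for a deletion, and it assumes an item counts as "distinct" whenever its net frequency is nonzero. The class name ExactDistinctCounter and its methods are hypothetical, chosen only for illustration. Because it stores one counter per live item, it uses linear space, which is exactly the cost that sub-linear streaming algorithms aim to avoid.

```python
class ExactDistinctCounter:
    """Exact baseline (hypothetical name): linear space, O(1) expected time per update."""

    def __init__(self):
        self.freq = {}      # item -> net frequency (only nonzero entries are kept)
        self.distinct = 0   # number of items whose net frequency is nonzero

    def update(self, item, delta):
        """Apply one stream update: delta > 0 inserts, delta < 0 deletes."""
        old = self.freq.get(item, 0)
        new = old + delta
        # Assumption: an item is "distinct" iff its net frequency is nonzero.
        if old == 0 and new != 0:
            self.distinct += 1
        elif old != 0 and new == 0:
            self.distinct -= 1
        if new == 0:
            self.freq.pop(item, None)   # drop zero entries to bound memory
        else:
            self.freq[item] = new


counter = ExactDistinctCounter()
stream = [("a", +1), ("b", +1), ("a", +1), ("b", -1), ("c", +1)]
for item, delta in stream:
    counter.update(item, delta)
print(counter.distinct)  # prints 2: "a" and "c" remain, "b" was fully deleted
```

An approximate algorithm would replace the dictionary with a small summary and report an estimate of the same quantity; this baseline is useful mainly as a reference against which such estimates can be checked.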