Efficient Aggregate Computation over Data Streams

  • Authors:
  • Kanthi Nagaraj;K. V. M. Naidu;Rajeev Rastogi;Scott Satkin

  • Affiliations:
  • Bell Labs Research, Bangalore, India. kanthicn@alcatel-lucent.com;Bell Labs Research, Bangalore, India. naidukvm@alcatel-lucent.com;Bell Labs Research, Bangalore, India. rastogi@alcatel-lucent.com;Bell Labs Research, Murray Hill, New Jersey, USA. scott@satkin.com

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

Cisco's NetFlow Collector (NFC) is a powerful example of a real-world product that supports multiple aggregate queries over a continuous stream of IP flow records. NFC enables a wide range of network management tasks, such as traffic demand estimation and application traffic profiling. In this paper, we investigate two computation-sharing techniques that enable streaming applications such as NFC to scale to hundreds of queries. Our first technique instantiates certain intermediate aggregates, which are then used to generate the final answers for the input queries. Our second technique coalesces the filter conditions of similar queries and uses the coalesced filter to pre-filter the stream data input to these queries. Using these techniques, we propose a heuristic to compute a good query plan, and we perform extensive simulations to show that our heuristic delivers a performance improvement of more than a factor of 3 over a naive approach.
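The two sharing techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's heuristic or NFC's implementation: the flow record layout (src, dst, port, bytes), the two example queries, and the function names `sum_by`, `naive_plan`, and `shared_plan` are all assumptions made for illustration. The sketch only shows that a coalesced filter followed by a finer-grained intermediate aggregate can yield the same answers as evaluating each query separately on the raw stream.

```python
from collections import defaultdict

# Hypothetical flow records (src, dst, port, bytes); all values are made up.
FLOWS = [
    ("a", "x", 80, 100),
    ("a", "x", 80, 30),
    ("a", "y", 443, 50),
    ("b", "x", 80, 70),
    ("a", "x", 22, 10),
    ("b", "y", 53, 5),
]

def sum_by(rows, key_fn, value_fn):
    """Group rows by key_fn(row) and sum value_fn(row) per group."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += value_fn(row)
    return dict(totals)

def naive_plan(flows):
    """Each query scans the raw stream and applies its own filter."""
    # Query 1 (assumed): bytes per source for port 80.
    q1 = sum_by((f for f in flows if f[2] == 80), lambda f: f[0], lambda f: f[3])
    # Query 2 (assumed): bytes per destination for port 443.
    q2 = sum_by((f for f in flows if f[2] == 443), lambda f: f[1], lambda f: f[3])
    return q1, q2

def shared_plan(flows):
    """Apply one coalesced filter, build one intermediate aggregate, reuse it."""
    # Coalesced filter: the union of both queries' port conditions.
    prefiltered = (f for f in flows if f[2] in (80, 443))
    # Intermediate aggregate: grouped more finely than either query needs.
    inter = sum_by(prefiltered, lambda f: (f[0], f[1], f[2]), lambda f: f[3])
    # Final answers are derived from the intermediate aggregate, not the stream.
    q1 = sum_by((kv for kv in inter.items() if kv[0][2] == 80),
                lambda kv: kv[0][0], lambda kv: kv[1])
    q2 = sum_by((kv for kv in inter.items() if kv[0][2] == 443),
                lambda kv: kv[0][1], lambda kv: kv[1])
    return q1, q2
```

In this toy setting the shared plan touches each raw record once instead of once per query. Whether that pays off in practice depends on how much the intermediate aggregate shrinks the data, which is the kind of trade-off a query-plan heuristic would have to weigh.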