Dynamic adaptive data structures for monitoring data streams

  • Authors:
  • J. Aguilar-Saborit; P. Trancoso; V. Muntes-Mulero; J. L. Larriba-Pey

  • Affiliations:
  • IBM Toronto Laboratory, 8200 Warden Avenue, Markham, ON, Canada L6G1C7; Department of Computer Science, University of Cyprus, Nicosia, Cyprus; DAMA-UPC, Computer Architecture Department, Universitat Politecnica de Catalunya, Spain; DAMA-UPC, Computer Architecture Department, Universitat Politecnica de Catalunya, Spain

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2008

Abstract

Monitoring data streams is an important problem in many areas. Accuracy, response time, memory use, and adaptability to the changing nature of the data vary in importance depending on the situation. Web page access monitoring, approximate aggregation in relational queries, and IP message routing illustrate the varied range of those needs. Several data structures address this problem, such as counting Bloom filters, spectral Bloom filters, and dynamic count filters. These data structures range from static to complex dynamic representations of the data stream, and each keeps an approximate count of the number of occurrences of each data value. In this paper, we focus on three main aspects. First, we put the problem in perspective and review the existing static and dynamic solutions. Second, we propose and analyze in depth a simple yet powerful partitioning strategy that reinforces the advantages of the methods proposed so far while solving most of their drawbacks. Finally, using real executions and mathematical models, we evaluate the existing methods alone and in combination with our partitioning strategy. We show that our partitioning strategy makes it possible to reduce memory requirements and average response time and to improve adaptiveness to changing data characteristics, while leaving the accuracy of the partitioned dynamic data structures intact.
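
As an illustration of what "an approximate count of the number of occurrences of each data value" can look like in practice, the following is a minimal sketch of a static counting Bloom filter. It is not the implementation evaluated in the paper and does not show the partitioning strategy. The class name, parameter names, and the choice of salted BLAKE2b as the hash family are assumptions made for this example, and the counters are unbounded Python integers rather than the fixed-width counters a memory-conscious implementation would use.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative counting Bloom filter (hypothetical names, not from the paper)."""

    def __init__(self, num_counters=1024, num_hashes=4):
        self.m = num_counters            # number of counters
        self.k = num_hashes              # number of hash functions
        self.counters = [0] * num_counters

    def _positions(self, value):
        # Derive k counter positions by salting one hash function (an assumption;
        # any family of independent hash functions would serve).
        data = str(value).encode("utf-8")
        for i in range(self.k):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.m

    def insert(self, value):
        # Record one occurrence by incrementing every counter the value maps to.
        for pos in self._positions(value):
            self.counters[pos] += 1

    def delete(self, value):
        # Remove one occurrence, but only if the filter believes the value is present.
        if self.estimate(value) > 0:
            for pos in self._positions(value):
                self.counters[pos] -= 1

    def estimate(self, value):
        # The minimum over the value's counters never underestimates the true count;
        # collisions with other values can only inflate it.
        return min(self.counters[pos] for pos in self._positions(value))
```

With inserts only, `estimate` returns a value that is at least the true count; with few distinct values relative to `num_counters`, it is usually exact. Deleting a value that was never inserted can corrupt the counts of other values, so this sketch assumes deletions mirror earlier insertions.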