A disk-based, adaptive approach to memory-limited computation of windowed stream joins

  • Authors: Abhirup Chakraborty; Ajit Singh
  • Affiliations: Dept. of Electrical and Computer Engineering, University of Waterloo, ON, Canada (both authors)
  • Venue: DEXA'10 Proceedings of the 21st International Conference on Database and Expert Systems Applications: Part I
  • Year: 2010

Abstract

We consider the problem of computing exact results for sliding window joins over data streams with limited memory. Existing approaches either (a) handle memory limitations by shedding load, and therefore cannot provide exact or even highly accurate results for sliding window joins over data streams with time-varying arrival rates, or (b) suffer from large I/O overhead due to random disk flushes and disk-to-disk stages within the stream join, which makes them inefficient for sliding window joins. We present an Adaptive, Hash-partitioned Exact Window Join (AH-EWJ) algorithm that uses disk storage as an archive. The algorithm periodically spills window data to disk, refines the output by retrieving the disk-resident data, and maximizes the output rate by managing the memory blocks and continuously adjusting the memory allocated within the stream windows. The problem of managing the window blocks in memory, which is similar in nature to caching, involves both the temporal and the frequency-related properties of the stream arrivals. The algorithm adapts memory allocation at both the window level and the partition level. We provide experimental results demonstrating the performance and effectiveness of the proposed algorithm.
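
The general idea of an exact, hash-partitioned window join that spills to an archive can be illustrated with a small sketch. This is not the AH-EWJ algorithm itself, whose block management and adaptive memory allocation are not specified in the abstract. It is a toy under stated assumptions: the class `SpillingWindowJoin`, the helper `run`, the parameters `window`, `mem_limit`, and `num_partitions`, and the "spill the oldest tuple of the largest partition" policy are all hypothetical, and the "disk" is simulated by a second in-process store.

```python
from collections import defaultdict, deque


class SpillingWindowJoin:
    """Toy exact sliding-window equi-join over two streams (0 and 1).

    Hypothetical illustration only: the "disk" archive is a second
    in-process store, and the spill policy is a simple assumption.
    """

    def __init__(self, window, mem_limit, num_partitions=4):
        self.window = window              # window length in time units
        self.mem_limit = mem_limit        # max in-memory tuples per stream
        self.num_partitions = num_partitions
        # Per stream: partition id -> deque of (timestamp, key, value).
        self.memory = [defaultdict(deque), defaultdict(deque)]
        self.disk = [defaultdict(deque), defaultdict(deque)]
        self.mem_count = [0, 0]

    def _partition(self, key):
        return hash(key) % self.num_partitions

    def _expire(self, stream, now):
        # Drop tuples that have slid out of the window, in memory and on "disk".
        cutoff = now - self.window
        for part in self.memory[stream].values():
            while part and part[0][0] <= cutoff:
                part.popleft()
                self.mem_count[stream] -= 1
        for part in self.disk[stream].values():
            while part and part[0][0] <= cutoff:
                part.popleft()

    def _spill(self, stream):
        # Assumed policy: move the oldest tuple of the largest partition.
        mem = self.memory[stream]
        while self.mem_count[stream] > self.mem_limit:
            p = max(mem, key=lambda q: len(mem[q]))
            self.disk[stream][p].append(mem[p].popleft())
            self.mem_count[stream] -= 1

    def insert(self, stream, ts, key, value):
        """Add a tuple and return the join results it produces."""
        other = 1 - stream
        for s in (0, 1):
            self._expire(s, ts)
        p = self._partition(key)
        results = []
        # Probe the archive as well as memory so that no match is lost.
        for store in (self.disk[other], self.memory[other]):
            for _, other_key, other_value in store[p]:
                if other_key == key:
                    pair = (value, other_value) if stream == 0 else (other_value, value)
                    results.append(pair)
        self.memory[stream][p].append((ts, key, value))
        self.mem_count[stream] += 1
        self._spill(stream)
        return results


def run(arrivals, window, mem_limit):
    """Feed (stream, timestamp, key, value) arrivals in timestamp order."""
    join = SpillingWindowJoin(window, mem_limit)
    out = []
    for stream, ts, key, value in arrivals:
        out.extend(join.insert(stream, ts, key, value))
    return out
```

Because every probe consults both the in-memory partition and its archived counterpart, the output in this sketch does not depend on `mem_limit`; only the amount of (simulated) disk access does. A real implementation would batch spills and disk probes rather than touching the archive on every arrival, which is the kind of I/O overhead the abstract is concerned with.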