Adaptive query processing in data stream management systems under limited memory resources

  • Authors:
  • Fatima Farag;Moustafa Hammad;Reda Alhajj

  • Affiliations:
  • University of Calgary, Calgary, AB, Canada;Google Inc., Mountain View, CA, USA;University of Calgary, Calgary, AB, Canada

  • Venue:
  • PIKM '10 Proceedings of the 3rd workshop on Ph.D. students in information and knowledge management
  • Year:
  • 2010

Abstract

Many data stream sources are prone to spikes in volume as well as periods of delay and silence. Because peak load during a spike can be orders of magnitude higher than typical load, fully provisioning a data stream monitoring system with all the resources it needs is generally difficult. Furthermore, data stream sources are subject to network delays and congestion because they connect to the monitoring system over shared communication channels. Careless handling of delays and periods of silence eventually degrades system performance drastically. Our research investigates system performance during periods of peak load and periods of delay while supporting data stream applications such as online stock monitoring. We propose an algorithm, termed EM-SWJoin, that uses external-memory data structures to keep up with variable data arrival rates while keeping disk access latency to a minimum. We also propose ADEDAS, an algorithm that guarantees ordered release of output results while controlling the impact of delays on stream processing. Finally, we investigate how to deploy column-stores in data stream environments, where column-oriented physical designs replace row-by-row data representations.