Load shedding for window joins over streams

Authors:
Dong-Hong Han;Guo-Ren Wang;Chuan Xiao;Rui Zhou
Affiliations:
Institute of Computer System, Northeastern University, Shenyang, China;Institute of Computer System, Northeastern University, Shenyang, China;Institute of Computer System, Northeastern University, Shenyang, China;Institute of Computer System, Northeastern University, Shenyang, China
Venue:
Journal of Computer Science and Technology
Year:
2007


Abstract

We present several load shedding techniques for sliding window joins over data streams. We first construct a dual-window architectural model consisting of aux-windows and join-windows, and build statistics on the aux-windows. Using these statistics, we develop an effective load shedding strategy that produces a maximum-subset join output. To accelerate the load shedding process, we use binary indexed trees to reduce the cost of shedding evaluation. For streams with high arrival rates, we propose an approach that combines front-shedding and rear-shedding, and we find an optimal trade-off between the two. For scenarios in which the speed ratio between streams varies, we develop a plan that reallocates CPU resources and dynamically resizes the windows. In addition, we prove that load shedding is not affected during reallocation. Experiments on both synthetic and real data show that our strategies are promising.
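
The abstract names binary indexed trees but does not describe how they are applied. As a minimal sketch of the general data structure only, the Python below maintains per-value tuple counts for a window and answers cumulative-count queries in O(log n). The class name `FenwickTree`, its method names, and the idea of treating join-attribute values as integer keys in a fixed domain are illustrative assumptions, not the authors' implementation.

```python
class FenwickTree:
    """Binary indexed tree over integer keys 0..size-1.

    Hypothetical illustration: each key stands for a join-attribute
    value, and the stored number is how many tuples with that value
    are currently in a window. This mapping is an assumption.
    """

    def __init__(self, size: int) -> None:
        self.size = size
        self.tree = [0] * (size + 1)  # internal array is 1-based

    def add(self, key: int, delta: int) -> None:
        """Add delta to the count of key (use delta=-1 when a tuple expires)."""
        i = key + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i  # move to the next node covering this position

    def prefix_sum(self, key: int) -> int:
        """Return the total count of all keys in 0..key inclusive."""
        i = key + 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total

    def range_sum(self, lo: int, hi: int) -> int:
        """Return the total count of keys in lo..hi inclusive."""
        if lo > hi:
            return 0
        return self.prefix_sum(hi) - self.prefix_sum(lo - 1)


# Usage: five tuples arrive, then one tuple with value 3 expires.
stats = FenwickTree(8)
for value in [3, 3, 5, 1, 3]:
    stats.add(value, 1)
stats.add(3, -1)

count_of_3 = stats.range_sum(3, 3)   # tuples with value exactly 3
count_up_to_3 = stats.prefix_sum(3)  # tuples with value <= 3
count_all = stats.prefix_sum(7)      # all tuples still counted
```

Both updates and cumulative queries touch at most about log2(size) nodes, which is why this kind of structure suits statistics that change on every tuple arrival and expiry. A plain array of counts would give constant-time updates but linear-time cumulative queries.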