CPU load shedding for binary stream joins

Authors:
Bugra Gedik;Kun-Lung Wu;Philip S. Yu;Ling Liu
Affiliations:
IBM T.J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA and Georgia Institute of Technology, College of Computing, 19 Skyline Drive, 30332, Atlanta, GA, USA;IBM T.J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA;IBM T.J. Watson Research Center, 19 Skyline Drive, 10532, Hawthorne, NY, USA;Georgia Institute of Technology, College of Computing, 19 Skyline Drive, 30332, Atlanta, GA, USA
Venue:
Knowledge and Information Systems
Year:
2007

Citing 0
Cited 3

CellJoin: a parallel stream join operator for the cell processor
The VLDB Journal — The International Journal on Very Large Data Bases

Answering linear optimization queries with an approximate stream index
Knowledge and Information Systems

Frequency-based load shedding over a data stream of tuples
Information Sciences: an International Journal

Abstract

We present an adaptive load shedding approach for windowed stream joins. In contrast to the conventional approach of dropping tuples from the input streams, we explore the concept of selective processing for load shedding. We allow stream tuples to be stored in the windows and shed excessive CPU load by performing the join operations, not on the entire set of tuples within the windows, but on a dynamically changing subset of tuples that are learned to be highly beneficial. We support such dynamic selective processing through three forms of runtime adaptations: adaptation to input stream rates, adaptation to time correlation between the streams and adaptation to join directions. Our load shedding approach enables us to integrate utility-based load shedding with time correlation-based load shedding. Indexes are used to further speed up the execution of stream joins. Experiments are conducted to evaluate our adaptive load shedding in terms of output rate and utility. The results show that our selective processing approach to load shedding is very effective and significantly outperforms the approach that drops tuples from the input streams.
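
The idea of selective processing can be illustrated with a small sketch. This is not the authors' algorithm; it is a minimal toy under stated assumptions. The class name SelectiveWindow, the split of the window into equal-width age segments, the per-segment hit counters, and the fraction parameter are all hypothetical and chosen only for illustration. Every tuple stays in the window; load is shed by probing only the segments that earlier full probes found to produce the most matches, which loosely mirrors adaptation to time correlation. The fraction argument stands in for a throttle that a real system might tune to the input rates.

```python
import bisect
import math


class SelectiveWindow:
    """Time-based window over one stream, probed by tuples of the other stream.

    Hypothetical sketch: the window is split into equal-width age segments and
    only the segments with the most recorded matches are scanned.
    """

    def __init__(self, window, segments):
        self.window = window          # window length in time units
        self.segments = segments      # number of equal-width age segments
        self.ts = []                  # timestamps, ascending
        self.keys = []                # join keys, parallel to self.ts
        self.hits = [0] * segments    # matches recorded per segment while learning

    def insert(self, ts, key):
        # Tuples are always stored; nothing is dropped from the input.
        self.ts.append(ts)
        self.keys.append(key)

    def _expire(self, now):
        # Remove tuples whose age is at least the window length.
        cut = bisect.bisect_right(self.ts, now - self.window)
        del self.ts[:cut]
        del self.keys[:cut]

    def _segment_range(self, now, s):
        # Segment s covers ages in [s * width, (s + 1) * width).
        width = self.window / self.segments
        lo = bisect.bisect_right(self.ts, now - (s + 1) * width)
        hi = bisect.bisect_right(self.ts, now - s * width)
        return lo, hi

    def probe(self, now, key, fraction, learn=False):
        """Return timestamps of stored tuples whose key equals `key`.

        With learn=True every segment is scanned and hit counters are updated.
        Otherwise only the top ceil(fraction * segments) segments are scanned.
        """
        self._expire(now)
        if learn:
            chosen = list(range(self.segments))
        else:
            k = max(1, math.ceil(fraction * self.segments))
            ranked = sorted(range(self.segments), key=lambda s: -self.hits[s])
            chosen = ranked[:k]
        matches = []
        for s in chosen:
            lo, hi = self._segment_range(now, s)
            for i in range(lo, hi):
                if self.keys[i] == key:
                    matches.append(self.ts[i])
                    if learn:
                        self.hits[s] += 1
        return matches
```

In use, a join operator would keep one such window per input stream, run an occasional probe with learn=True to refresh the counters, and run the remaining probes with a fraction below 1 when the CPU cannot keep up. The results of a selective probe are a subset of the full join results, so output is traded for lower probe cost.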