Tuple dropping, though commonly used for load shedding in most data stream operations, is generally inadequate for multi-way, windowed stream joins. The join output rate can be unnecessarily reduced because tuple dropping fails to exploit the time correlations likely to exist among interrelated streams. In this paper, we introduce GrubJoin - an adaptive, multi-way, windowed stream join that effectively performs time correlation-aware CPU load shedding. GrubJoin maximizes the output rate by achieving near-optimal window harvesting, which picks only the most profitable segments of individual windows for the join. Due mainly to the combinatorial explosion of possible multi-way join sequences involving different window segments, GrubJoin faces unique challenges that do not exist for binary joins, such as determining the optimal window harvesting configuration in a time efficient manner and learning the time correlations among the streams without introducing overhead. To tackle these challenges, we formalize window harvesting as an optimization problem, develop greedy heuristics to determine near-optimal window harvesting configurations and use approximation techniques to capture the time correlations. Our experimental results show that GrubJoin is vastly superior to tuple dropping when time correlations exist and is equally effective when time correlations are nonexistent.
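
The idea of window harvesting can be illustrated with a toy sketch. This is not GrubJoin's actual heuristic, which the abstract does not spell out; it only shows the general shape of a greedy choice of window segments under a CPU budget. The function name `greedy_harvest`, the `scores` table, and the assumption that every segment costs one unit of CPU are all hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def greedy_harvest(scores: List[List[float]], budget: int) -> List[Tuple[int, int]]:
    """Pick up to `budget` (window, segment) pairs with the highest estimated score.

    scores[w][s] is a hypothetical estimate of how many join results segment s
    of window w would contribute. Each segment is assumed to cost one unit of
    CPU, so `budget` is simply the number of segments that can be processed.
    """
    candidates = [
        (score, w, s)
        for w, segments in enumerate(scores)
        for s, score in enumerate(segments)
    ]
    # Highest score first; ties are broken by window and segment index
    # so that the result is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    return [(w, s) for _, w, s in candidates[:budget]]


# Assumed example: time correlations concentrate matches in different
# segments of the two windows.
scores = [
    [0.1, 0.7, 0.2],  # window 0: most matches fall in segment 1
    [0.6, 0.3, 0.1],  # window 1: most matches fall in segment 0
]
chosen = greedy_harvest(scores, budget=2)
```

With a budget of two segments, the sketch keeps segment 1 of window 0 and segment 0 of window 1. Tuple dropping, by contrast, would thin out every segment uniformly, regardless of where the matches are likely to be.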