Main memory is a critical resource when processing long-running queries over data streams with state-intensive operators. In this work, we investigate state spill strategies that handle run-time memory shortages when processing such complex queries by selectively pushing operator states to disk. Unlike previous solutions, which all focus on a single operator, we target queries with multiple state-intensive operators. We observe an interdependency among the operators in a query plan when spilling operator states, and we show that existing strategies, which do not take this interdependency into account, become largely ineffective in this query context. A consolidated plan-level spill strategy is therefore needed. We propose several data spill strategies that aim to maximize run-time query throughput in memory-constrained environments. The bottom-up state spill strategy is an operator-level strategy that treats all data in one operator state equally. The more sophisticated partition-level data spill strategies take different characteristics of the input data into account; they are the local output, the global output, and the global output with penalty strategies. All proposed strategies have been implemented in the D-CAPE continuous query system. Experimental results confirm their effectiveness. In particular, the global output and global output with penalty strategies show favorable results compared to the other two, more localized strategies.
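
To make the difference between a local and a global partition-level choice concrete, the following is a minimal sketch and not the D-CAPE implementation. The abstract does not define the ranking metrics, so the sketch assumes that each in-memory partition has a size in tuples, a count of results it produced at its own operator (`local_output`), and a count of results it contributed to the final query output (`global_output`), and that the least productive partitions per tuple are spilled first. The class `Partition`, the function `choose_spill`, and all numbers are hypothetical.

```python
class Partition:
    """One partition of one operator's in-memory state (hypothetical model)."""

    def __init__(self, operator, key, size, local_output, global_output):
        self.operator = operator            # name of the owning operator
        self.key = key                      # partition identifier
        self.size = size                    # tuples currently held in memory
        self.local_output = local_output    # results produced at this operator
        self.global_output = global_output  # results reaching the query output


def choose_spill(partitions, tuples_to_free, metric):
    """Pick partitions to spill, least productive per tuple first.

    `metric` is "local_output" or "global_output". The productivity ratio
    (output / size) is an assumption made for this sketch.
    """
    candidates = [p for p in partitions if p.size > 0]
    ranked = sorted(candidates, key=lambda p: getattr(p, metric) / p.size)
    chosen, freed = [], 0
    for p in ranked:
        if freed >= tuples_to_free:
            break
        chosen.append(p)
        freed += p.size
    return chosen


# Made-up state for a plan with two joins.
state = [
    Partition("join1", 0, size=100, local_output=50, global_output=5),
    Partition("join1", 1, size=100, local_output=10, global_output=40),
    Partition("join2", 0, size=100, local_output=30, global_output=20),
]

local_choice = choose_spill(state, 100, "local_output")
global_choice = choose_spill(state, 100, "global_output")
```

With these made-up numbers the two metrics disagree: the local ranking spills partition 1 of `join1`, which contributes the most to the final output, while the global ranking spills partition 0 of `join1`, which is locally productive but contributes little downstream. This is the kind of interdependency among operators that a purely operator-local choice ignores.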