In this paper, we propose a framework for adaptive admission control and management of a large number of dynamic input streams in parallel stream processing engines. The framework takes as input any available information about input stream behavior and the requirements of the query processing layer, and adaptively decides how to adjust the entry points of streams into the system. Because optimization decisions propagate early from the input management layer to the query processing layer, the cluster size is minimized, load balance is maintained, and query latency bounds are met more effectively and in a more timely manner. Declarative integration of external metadata about data sources makes the system more robust and resource-efficient. Additionally, exploiting knowledge about queries moves data partitioning to the input management layer, where better load balance for query processing can be achieved. We implemented these techniques as part of the Borealis stream processing system and conducted experiments that show the performance benefits of our framework.