Scalability in stream processing systems can be achieved by using a cluster of computing devices: the processing burden is distributed among the nodes by partitioning the query graph. The specific operator placement plan can have a large impact on performance. Previous work has focused on moving query operators dynamically in reaction to load changes in order to keep the load balanced. However, operator movement is too expensive to alleviate short-term bursts; moreover, some systems do not support moving operators dynamically. In this paper, we develop algorithms for selecting an operator placement plan that is resilient to changes in load. In other words, we assume that operators cannot move, and we therefore place them so that the resulting system can withstand the largest set of input rate combinations. We call this a resilient placement.

This paper first formalizes the problem for operators that exhibit linear load characteristics (e.g., filter, aggregate) and introduces a resilient placement algorithm. We then show how to extend the algorithm to take advantage of additional workload information (such as known minimum input stream rates). We further show how this approach can be extended to operators that exhibit non-linear load characteristics (e.g., join). Finally, we present prototype- and simulation-based experiments that quantify the benefits of our approach over existing techniques using real network traffic traces.
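To make "the largest set of input rate combinations" concrete, the following is a minimal sketch under a linear load model. It is not the paper's algorithm. It assumes each operator's load is a weighted sum of the input stream rates, estimates the size of a placement's feasible set by uniform random sampling, and picks a placement by brute force. All names (`node_loads`, `is_feasible`, `feasible_fraction`, `best_placement`) and the example coefficients, capacities, and rate bounds are hypothetical.

```python
import random
from itertools import product


def node_loads(placement, coeffs, rates, n_nodes):
    """Per-node load under a linear load model (an assumption of this sketch).

    placement[i] is the node hosting operator i; coeffs[i][k] is the load
    operator i incurs per unit rate of input stream k.
    """
    loads = [0.0] * n_nodes
    for op, node in enumerate(placement):
        loads[node] += sum(c * r for c, r in zip(coeffs[op], rates))
    return loads


def is_feasible(placement, coeffs, rates, capacities):
    """True if no node exceeds its capacity at the given input rates."""
    loads = node_loads(placement, coeffs, rates, len(capacities))
    return all(load <= cap for load, cap in zip(loads, capacities))


def feasible_fraction(placement, coeffs, capacities, max_rates,
                      samples=20000, seed=0):
    """Estimate the share of the rate box [0, max_rates] the placement sustains."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        rates = [rng.uniform(0.0, m) for m in max_rates]
        if is_feasible(placement, coeffs, rates, capacities):
            hits += 1
    return hits / samples


def best_placement(coeffs, capacities, max_rates, samples=5000):
    """Brute-force search over all placements; only viable for tiny inputs."""
    candidates = product(range(len(capacities)), repeat=len(coeffs))
    return max(candidates,
               key=lambda p: feasible_fraction(p, coeffs, capacities,
                                               max_rates, samples))


# Hypothetical example: two input streams, two operators per stream,
# two nodes with unit capacity.
COEFFS = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
CAPACITIES = [1.0, 1.0]
MAX_RATES = [1.0, 1.0]

BY_STREAM = (0, 0, 1, 1)    # each node serves one stream entirely
INTERLEAVED = (0, 1, 0, 1)  # each node serves half of each stream

frac_by_stream = feasible_fraction(BY_STREAM, COEFFS, CAPACITIES, MAX_RATES)
frac_interleaved = feasible_fraction(INTERLEAVED, COEFFS, CAPACITIES, MAX_RATES)
print("by stream:", frac_by_stream)
print("interleaved:", frac_interleaved)
```

In this toy instance the by-stream placement is feasible only when both rates stay at or below 0.5 (a square of area 0.25), whereas the interleaved placement is feasible whenever the two rates sum to at most 1 (a triangle of area 0.5), so the sampled estimates should land close to those values. Both placements are perfectly balanced when the two rates are equal, but the interleaved one tolerates a burst on either stream without any operator being moved.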