Data streaming has become an important paradigm for the real-time processing of continuous data flows in domains such as finance, telecommunications, and networking. Some applications in these domains need to process massive data flows that current technology is unable to manage, that is, streams that, even for a single query operator, may require the capacity of many machines. Research efforts on data streaming have mainly focused on scaling the number of queries or query operators, but have overlooked scalability with respect to stream volume. In this paper, we present StreamCloud, a large-scale data streaming system for processing large data stream volumes. We focus on how to parallelize continuous queries to obtain a highly scalable data streaming infrastructure. StreamCloud goes beyond the state of the art by using a novel parallelization technique that splits queries into subqueries, which are allocated to independent sets of nodes in a way that minimizes the distribution overhead. StreamCloud is implemented as middleware and is highly independent of the underlying data streaming engine. We explore and evaluate different strategies for parallelizing data streaming and tackle the main bottlenecks and overheads to achieve scalability. The paper presents the system design, the implementation, and a thorough evaluation of the scalability of the fully implemented system.
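To make the idea of spreading one query operator over several nodes concrete, the following is a minimal sketch of key-based stream partitioning for a stateful count. It is an illustration under stated assumptions, not StreamCloud's actual technique or API: the function names (`route`, `partition`, `count_per_key`, `parallel_count`), the dictionary tuple format, and the choice of CRC32 as the routing hash are all hypothetical, and the "nodes" are simulated as in-memory lists.

```python
from collections import defaultdict
from zlib import crc32


def route(key: str, num_nodes: int) -> int:
    """Map a key to a node index with a stable hash (hypothetical routing rule)."""
    return crc32(key.encode("utf-8")) % num_nodes


def partition(stream, key_field, num_nodes):
    """Split a stream of dict tuples into per-node substreams by key.

    All tuples sharing a key land on the same node, so a stateful
    operator can run on each node without exchanging state.
    """
    substreams = defaultdict(list)
    for tup in stream:
        substreams[route(tup[key_field], num_nodes)].append(tup)
    return substreams


def count_per_key(substream, key_field):
    """Stateful operator instance: count tuples per key on one node."""
    counts = defaultdict(int)
    for tup in substream:
        counts[tup[key_field]] += 1
    return dict(counts)


def parallel_count(stream, key_field, num_nodes):
    """Run one operator instance per simulated node and merge the results.

    Key sets are disjoint across nodes, so a plain dict update suffices.
    """
    merged = {}
    for node_tuples in partition(stream, key_field, num_nodes).values():
        merged.update(count_per_key(node_tuples, key_field))
    return merged


if __name__ == "__main__":
    calls = [{"caller": c} for c in ["a", "b", "a", "c", "b", "a"]]
    print(parallel_count(calls, "caller", num_nodes=3))
```

Because every tuple with a given key is routed to the same node, the partitioned result matches what a single operator instance would compute over the whole stream. A real deployment would also have to handle windows, operators whose state spans several keys, and the cost of redistributing tuples between consecutive operators, which this sketch does not model.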