Fault-tolerant complex event processing using customizable state machine-based operators
Proceedings of the 15th International Conference on Extending Database Technology
Integrating scale out and fault tolerance in stream processing using operator state management
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Adaptive online scheduling in storm
Proceedings of the 7th ACM international conference on Distributed event-based systems
Proceedings Demo & Poster Track of ACM/IFIP/USENIX International Middleware Conference
Tutorial: Elastic and Fault Tolerant Event Stream Processing using StreamMine3G
UCC '13 Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing
Scalable and Real-Time Deep Packet Inspection
UCC '13 Proceedings of the 2013 IEEE/ACM 6th International Conference on Utility and Cloud Computing
Hi-index | 0.00 |
MapReduce has become a popular programming paradigm in the domain of batch processing systems. Its simplicity allows applications to be highly scalable and to be easily deployed on large clusters. More recently, the MapReduce approach has been also applied to Event Stream Processing (ESP) systems. This approach, which we call StreamMapReduce, enabled many novel applications that require both scalability and low latency. Another recent trend is to move distributed applications to public clouds such as Amazon EC2 rather than running and maintaining private data centers. Most cloud providers charge their customers on an hourly basis rather than on CPU cycles consumed. However, many applications, especially those that process online data, need to limit their CPU utilization to conservative levels (often as low as $50\%$) to be able to accommodate natural and sudden load variations without causing unacceptable deterioration in responsiveness. In this paper, we present a new fault tolerance approach based on active replication for StreamMapReduce systems. This approach is cost effective for cloud consumers as well as cloud providers. Cost effectiveness is achieved by fully utilizing the acquired computational resources without performance degradation and by reducing the need for additional nodes dedicated to fault tolerance.