Replica placement for high availability in distributed stream processing systems

Authors:
Thomas Repantis; Vana Kalogeraki
Affiliations:
University of California, Riverside, CA (both authors)
Venue:
Proceedings of the second international conference on Distributed event-based systems
Year:
2008

Abstract

A significant number of emerging online data analysis applications must process data streams (large amounts of continuously updated data) to generate outputs of interest or to identify meaningful events. Example domains include network traffic management, stock price monitoring, customized e-commerce websites, and analysis of sensor data. In this paper we address the problem of high availability in such distributed stream processing systems. Taking into account the particular characteristics of stream processing applications, we first identify design principles for a replica placement algorithm for high availability. We then incorporate these principles into a decentralized replica placement protocol that aims to maximize availability while respecting resource constraints and making performance-aware placement decisions. We have integrated our replica placement protocol into Synergy, our distributed stream processing middleware. Our experimental comparison over PlanetLab with the current state of the art corroborates our claims that our techniques maximize availability while sustaining good performance.