Fully Distributed Three-Tier Active Software Replication

Authors:
Carlo Marchetti;Roberto Baldoni;Sara Tucci-Piergiovanni;Antonino Virgillito
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2006

Abstract

Keeping the replicas of a software service strongly consistent when the service is deployed across a distributed system that is prone to crashes and has highly unstable message transfer delays (e.g., the Internet) is a real practical challenge. The solution to this problem is subject to the FLP impossibility result, so "long enough" periods of synchrony, with time bounds on process speeds and message transfer delays, are needed to ensure deterministic termination of any run of the agreement protocols executed by replicas. This behavior can be abstracted by a partially synchronous computational model. In this setting, before a period of synchrony is reached, the underlying network can delay messages arbitrarily, and a timeout-based failure detection mechanism can perceive these delays as failures when none has occurred, leading to unexpected service unavailability. This paper proposes a fully distributed solution for active software replication based on a three-tier software architecture well suited to such a difficult setting. The formal correctness of the solution is proved under the assumption that the middle-tier runs in a partially synchronous distributed system. The architecture separates the ordering of the requests coming from clients, performed by the middle-tier, from their actual execution, performed by the replicas, i.e., the end-tier. In this way, clients can appear in any part of the distributed system and replica placement is simplified, since only the middle-tier has to be deployed on a well-behaving part of the distributed system that frequently respects synchrony bounds. This deployment permits rapid timeout tuning, thus reducing unexpected service unavailability.
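
The separation of request ordering from request execution described in the abstract can be illustrated with a minimal Python sketch. It is not the protocol proposed in the paper: the names Sequencer, Replica and run_demo are hypothetical, and the middle-tier is collapsed into a single in-memory sequencer with no failures, no agreement protocol and no network, purely as an assumption for illustration. It only shows that replicas applying deterministic operations in the order fixed by the middle-tier reach the same state, even when messages arrive out of order.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Hypothetical middle-tier stand-in: fixes one global order for requests."""
    next_seq: int = 0
    assigned: dict = field(default_factory=dict)  # request id -> sequence number

    def order(self, request_id, operation):
        # A retransmitted request keeps the sequence number it already has.
        if request_id not in self.assigned:
            self.assigned[request_id] = self.next_seq
            self.next_seq += 1
        return self.assigned[request_id], operation


@dataclass
class Replica:
    """Hypothetical end-tier replica: applies operations strictly in sequence order."""
    state: int = 0
    next_to_apply: int = 0
    pending: dict = field(default_factory=dict)  # sequence number -> operation

    def deliver(self, seq, operation):
        if seq < self.next_to_apply:
            return  # already applied, ignore the duplicate
        self.pending[seq] = operation
        # Apply every buffered operation that is next in the global order.
        while self.next_to_apply in self.pending:
            op = self.pending.pop(self.next_to_apply)
            self.state = op(self.state)
            self.next_to_apply += 1


def run_demo():
    sequencer = Sequencer()
    ordered = [
        sequencer.order("c1-1", lambda s: s + 5),
        sequencer.order("c2-1", lambda s: s * 2),
        sequencer.order("c1-2", lambda s: s - 3),
    ]
    a, b = Replica(), Replica()
    for seq, op in ordered:            # replica a receives messages in order
        a.deliver(seq, op)
    for seq, op in reversed(ordered):  # replica b receives them reversed
        b.deliver(seq, op)
    return a.state, b.state


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the replicas never run an agreement protocol themselves; they only buffer and apply. That is the property the abstract relies on when it says that only the middle-tier needs to be placed on a well-behaving part of the system.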