Keeping the replicas of a software service strongly consistent is a real practical challenge when the service is deployed across a distributed system that is prone to crashes and has highly unstable message transfer delays (e.g., the Internet). The solution to this problem is subject to the FLP impossibility result, so "long enough" periods of synchrony, with time bounds on process speeds and message transfer delays, are needed to ensure deterministic termination of any run of the agreement protocols executed by the replicas. This behavior can be abstracted by a partially synchronous computational model. In this setting, before a period of synchrony is reached, the underlying network can delay messages arbitrarily, and a timeout-based failure detection mechanism can mistake these delays for failures, leading to unexpected service unavailability. This paper proposes a fully distributed solution for active software replication, based on a three-tier software architecture, that is well suited to such a difficult setting. The formal correctness of the solution is proved under the assumption that the middle-tier runs in a partially synchronous distributed system. The architecture separates the ordering of the requests coming from clients, carried out by the middle-tier, from their actual execution, carried out by the replicas, i.e., the end-tier. In this way, clients can show up in any part of the distributed system and replica placement is simplified, since only the middle-tier has to be deployed on a well-behaved part of the distributed system that frequently respects the synchrony bounds. This deployment permits rapid timeout tuning, thereby reducing unexpected service unavailability.
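The separation of ordering from execution can be illustrated with a minimal, single-process Python sketch. It is not the protocol proposed in the paper: the names `Sequencer` and `Replica` and the `add`/`mul` operations are hypothetical, and the middle-tier is reduced to one in-memory object, whereas the paper's middle-tier is itself distributed and relies on agreement under partial synchrony. The sketch only shows why replicas that apply requests in the sequence order chosen by the middle-tier end in the same state, even when messages arrive reordered or duplicated.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Hypothetical stand-in for the middle-tier: one sequence number per request."""
    next_seq: int = 0
    assigned: dict = field(default_factory=dict)

    def order(self, request_id):
        # A re-submitted request keeps the number it was given the first time.
        if request_id not in self.assigned:
            self.assigned[request_id] = self.next_seq
            self.next_seq += 1
        return self.assigned[request_id]


@dataclass
class Replica:
    """Hypothetical end-tier replica holding a deterministic integer state."""
    state: int = 0
    next_to_apply: int = 0
    pending: dict = field(default_factory=dict)

    def deliver(self, seq, op):
        if seq < self.next_to_apply:
            return  # already applied: ignore the duplicate
        self.pending[seq] = op
        # Apply buffered requests strictly in sequence order, without gaps.
        while self.next_to_apply in self.pending:
            kind, value = self.pending.pop(self.next_to_apply)
            if kind == "add":
                self.state += value
            else:  # "mul"
                self.state *= value
            self.next_to_apply += 1


sequencer = Sequencer()
requests = [("c1-1", ("add", 5)), ("c2-1", ("mul", 3)), ("c1-2", ("add", 1))]
ordered = [(sequencer.order(rid), op) for rid, op in requests]

r1, r2 = Replica(), Replica()
for seq, op in ordered:            # r1 receives the requests in order
    r1.deliver(seq, op)
for seq, op in reversed(ordered):  # r2 receives them reordered by the network
    r2.deliver(seq, op)
r2.deliver(0, ("add", 5))          # and sees one duplicate

print(r1.state, r2.state)
```

Because `add` and `mul` do not commute, the two replicas agree only because both apply the requests in the sequencer's order; the replicas themselves never run a timeout-based agreement step in this sketch.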