A total order algorithm is a fundamental building block in the construction of distributed fault-tolerant applications. Unfortunately, implementing such a primitive can be expensive in terms of both communication steps and the number of messages exchanged. This problem is exacerbated in large-scale systems, where the performance of the algorithm may be limited by high-latency links. Typically, the most efficient total order algorithms do not provide uniform delivery and assume the availability of a perfect failure detector. Such algorithms may produce inconsistent results if the system assumptions do not hold. On the other hand, algorithms that assume an unreliable failure detector always provide consistent results but exhibit higher costs. This paper presents a new algorithm that combines the advantages of both approaches. In good periods, when the system is stable and no process is suspected, the algorithm operates as if a perfect failure detector were assumed. Yet the algorithm is indulgent: it never violates consistency, even in runs where processes are suspected.
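To make the total order property concrete, the following is a minimal sketch of a fixed-sequencer scheme, which is not the algorithm presented in the paper. It only illustrates the guarantee that all processes deliver messages in the same order regardless of arrival order; it ignores failures, failure detectors, uniformity, and indulgence entirely. The names `Sequencer`, `Process`, and `demo` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


class Sequencer:
    """Hypothetical fixed sequencer: stamps each message with the next number."""

    def __init__(self):
        self._next = count(0)

    def stamp(self, payload):
        # Pair the payload with a globally increasing sequence number.
        return (next(self._next), payload)


@dataclass
class Process:
    """Receiver that holds back stamped messages until they are deliverable in order."""

    next_seq: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, stamped):
        seq, payload = stamped
        self.pending[seq] = payload
        # Deliver every consecutive message starting at next_seq.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


def demo():
    sequencer = Sequencer()
    msgs = [sequencer.stamp(p) for p in ("a", "b", "c")]
    p1, p2 = Process(), Process()
    for m in msgs:            # p1 receives in sending order
        p1.receive(m)
    for m in reversed(msgs):  # p2 receives in reverse order
        p2.receive(m)
    return p1.delivered, p2.delivered
```

Calling `demo()` returns the delivery logs of both processes, which are identical even though the second process receives the messages in reverse. A single sequencer is a single point of failure in this sketch; handling that failure safely is where the failure-detector assumptions discussed in the abstract come into play.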