Improving the dependability of computer systems is a critical task. In this context, the paper surveys techniques that achieve fault tolerance in distributed systems through replication. The main replication techniques are explained first. Group communication is then introduced as the communication infrastructure on which the different replication techniques can be implemented. Finally, the difficulty of implementing group communication is discussed, and the most important algorithms are presented.