Group communication: from practice to theory

Authors:
André Schiper
Affiliations:
Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Venue:
SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
Year:
2006

Abstract

Improving the dependability of computer systems is a critical task. In this context, the paper surveys techniques for achieving fault tolerance in distributed systems through replication. The main replication techniques are explained first. Group communication is then introduced as the communication infrastructure on which the different replication techniques can be implemented. Finally, the difficulty of implementing group communication is discussed, and the most important algorithms are presented.