Totem: a fault-tolerant multicast group communication system
Communications of the ACM
The Totem multiple-ring ordering and topology maintenance protocol
ACM Transactions on Computer Systems (TOCS)
Dealing efficiently with data-center disasters
Journal of Parallel and Distributed Computing
A gossip-style failure detection service
Middleware '98 Proceedings of the IFIP International Conference on Distributed Systems Platforms and Open Distributed Processing
We present the Totem multiple-ring protocol, a novel reliable ordered multicast protocol for multiple interconnected local-area networks. The protocol exhibits excellent performance and maintains a consistent network-wide total order of messages despite network partitioning and remerging, or processor failure and recovery with stable storage intact. The Totem protocol is designed for fault-tolerant distributed systems, which replicate data to guard against failures and must keep the replicated data consistent despite those failures. The network-wide total order of messages provided by Totem simplifies the maintenance of consistency of replicated data and thus eases the development of fault-tolerant distributed systems.
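The claim that a total order simplifies replica consistency can be illustrated with a minimal sketch. This is not the Totem protocol itself: the `Replica` class, the `deliver` helper, and the `add`/`mul` operations are hypothetical names chosen for illustration, and the example assumes only that every replica starts from the same state and applies operations deterministically.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical replica holding one integer (not part of Totem)."""
    value: int = 0

    def apply(self, op):
        # Each operation is a (kind, argument) pair; add and mul do not commute.
        kind, arg = op
        if kind == "add":
            self.value += arg
        elif kind == "mul":
            self.value *= arg
        else:
            raise ValueError(f"unknown operation: {kind}")


def deliver(replica, ordered_ops):
    """Apply operations in the order given and return the resulting value."""
    for op in ordered_ops:
        replica.apply(op)
    return replica.value


ops = [("add", 2), ("mul", 3), ("add", 1)]

# Two replicas that see the same total order end in the same state.
same_a = deliver(Replica(), ops)
same_b = deliver(Replica(), ops)

# A replica that sees the first two messages swapped ends in a different state.
diverged = deliver(Replica(), [ops[1], ops[0], ops[2]])
```

Because the operations do not commute, agreement on delivery order is what keeps the replicas identical; a total-order multicast layer provides that agreement so the application does not have to reconcile divergent states itself.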