Reliable communication in the presence of failures
ACM Transactions on Computer Systems (TOCS)
Performance of the world's fastest distributed operating system
ACM SIGOPS Operating Systems Review
An efficient reliable broadcast protocol
ACM SIGOPS Operating Systems Review
Fault tolerance using group communication
EW 4 Proceedings of the 4th workshop on ACM SIGOPS European workshop
Fault tolerance support in distributed systems: every silver lining has a cloud
EW 4 Proceedings of the 4th workshop on ACM SIGOPS European workshop
Experience with a commercial Java implementation of group communication using reliable multicast
ACM SIGOPS Operating Systems Review
A practical basis for implementing fault-tolerant applications is described. At the application programming level, a set of group communication and transaction handling primitives is provided. A fault-tolerant, globally ordered broadcast service provides the basis for those primitives. The design is based on Kaashoek & Tanenbaum's reliable broadcast [4], [5], but is modified and implemented on a standard Unix system using the industry-standard TCP/IP protocol family. The implementation of the broadcast service is simple: it could be implemented at a low level to provide a new type of reliable LAN, or it could run on spare cycles of server machines that are also performing other tasks. The implementation of the group primitives on top of the broadcast service is not so simple; it is outlined but not described in detail. Code size and performance figures are given for an implementation of the system.
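
To make the idea of a globally ordered broadcast concrete, the following is a minimal in-memory sketch, not the implementation the abstract describes. It assumes a sequencer-style design, in which one process stamps every message with a global sequence number and each group member delivers messages strictly in that order. The names `Sequencer`, `Member`, `order`, `receive`, and `missing` are hypothetical, and networking, TCP/IP transport, and failure handling are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Sequencer:
    """Hypothetical sequencer: stamps each broadcast with a global number."""
    next_seq: int = 0
    history: dict = field(default_factory=dict)  # seq -> payload, kept for resends

    def order(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = payload
        return seq, payload


@dataclass
class Member:
    """Hypothetical group member: delivers in sequence-number order only."""
    expected: int = 0
    pending: dict = field(default_factory=dict)    # out-of-order arrivals
    delivered: list = field(default_factory=list)  # messages handed to the app

    def receive(self, seq, payload):
        if seq < self.expected:
            return  # duplicate of an already delivered message
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what came before.
        while self.expected in self.pending:
            self.delivered.append(self.pending.pop(self.expected))
            self.expected += 1

    def missing(self):
        """Sequence numbers this member would ask the sequencer to resend."""
        if not self.pending:
            return []
        return [s for s in range(self.expected, max(self.pending))
                if s not in self.pending]


def demo():
    sequencer = Sequencer()
    stamped = [sequencer.order(p) for p in ("a", "b", "c")]
    in_order, reordered = Member(), Member()
    for seq, payload in stamped:
        in_order.receive(seq, payload)
    for seq, payload in reversed(stamped):  # simulate network reordering
        reordered.receive(seq, payload)
    return in_order.delivered, reordered.delivered
```

Calling `demo()` returns the same delivery list for both members even though the second one receives the messages in reverse, which is the ordering property the group primitives rely on. A member that has seen a gap can use `missing()` to decide which sequence numbers to request again.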