An increasing number of applications with reliability requirements are being deployed in distributed systems that span large geographic distances or manage large numbers of objects. We consider the process group mechanism as an appropriate application structuring paradigm in such large-scale distributed systems. We give a formal characterization for the attribute "large scale" as applied to distributed systems and examine the technical problems that need to be solved in making group technology scalable. Our design advocates multiple roles for group membership over a minimal set of abstractions and primitives. The design is currently being implemented on top of "off-the-shelf" technologies for both communication and computation.