A distributed program consists of several components spread over a network of computers. Its reliability is strongly affected by the behaviour of the underlying distributed system software platform, so a fundamental step towards more reliable distributed programs is to provide a better environment in which they operate. This paper investigates the functional requirements of a distributed operating system's infrastructure, with the goal of improving the reliability of distributed programs. It identifies the major problems that make high software quality hard to achieve, then suggests functionality that the system infrastructure should provide. The design and implementation of a new distributed system service are then described.