Using Time Instead of Timeout for Fault-Tolerant Distributed Systems
ACM Transactions on Programming Languages and Systems (TOPLAS)
On the reliability of consensus-based fault-tolerant distributed computing systems
ACM Transactions on Computer Systems (TOCS)
Mirage: a coherent distributed shared memory design
SOSP '89 Proceedings of the twelfth ACM symposium on Operating systems principles
Memory coherence in shared virtual memory systems
ACM Transactions on Computer Systems (TOCS)
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
End-to-end arguments in system design
ACM Transactions on Computer Systems (TOCS)
Monitors: an operating system structuring concept
Communications of the ACM
Operating system principles
LOCUS a network transparent, high reliability distributed system
SOSP '81 Proceedings of the eighth ACM symposium on Operating systems principles
Distributed shared memory in a loosely-coupled environment
Support for fault-tolerant distributed systems has received much attention in recent years [BABA87, LAMP84, SCHL83]. In this position paper we present the aspects of our research into Distributed Shared Memory (DSM) systems that concern fault tolerance. We argue that the viability of DSM systems depends critically on how they address fault tolerance.

Our research, which began at the University of California, Los Angeles, concerned applications that use shared memory on single-site systems and the extension of those applications to a distributed environment. Our approach was to modify the underlying operating system to support a new facility called distributed shared memory.