Coherence protocols for distributed shared memory (DSM) should scale well to large networks. Large-scale environments contain more components that can malfunction, so they need fault tolerance in the form of highly available data access and uninterrupted DSM service. We present a new class of dynamic coherence protocols for DSM systems in error-prone networks; its instances offer highly available access to DSM data at low operation cost. The approach builds on the highly scalable Boundary-Restricted (BR) coherence protocol class. The new class, called the Dynamic Boundary-Restricted (DBR) coherence protocol class, tracks the read and write frequencies of DSM requests at run time. It uses this information to dynamically adjust the minimum number of cached copies of each DSM page so that a given degree of data availability is guaranteed. The description of the new protocol class is accompanied by an analysis covering a wide variety of workloads, which quantifies the overall savings of a DBR coherence protocol compared with a static BR protocol.
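To make the link between copy count and availability concrete, the following is a minimal sketch and not the DBR protocol itself. It assumes that nodes fail independently, that every node is up with the same probability, and that a page is accessible as long as at least one cached copy is reachable. The function name `min_copies` and its parameters are hypothetical, and the sketch leaves out the read/write frequency tracking that DBR uses to adjust the bound at run time.

```python
def min_copies(node_availability: float, target: float, max_copies: int = 64) -> int:
    """Smallest number of copies k with 1 - (1 - p)^k >= target.

    Assumptions (not taken from the paper): independent node failures,
    identical per-node availability p, and a page counts as available
    when at least one copy is reachable.
    """
    if not 0.0 < node_availability < 1.0:
        raise ValueError("node_availability must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    for k in range(1, max_copies + 1):
        # Probability that at least one of k independent copies is reachable.
        page_availability = 1.0 - (1.0 - node_availability) ** k
        if page_availability >= target:
            return k
    raise ValueError("target not reachable within max_copies")


if __name__ == "__main__":
    # Hypothetical example values, chosen only for illustration.
    for p, t in [(0.9, 0.995), (0.5, 0.9), (0.99, 0.9)]:
        print(f"p={p}, target={t}: at least {min_copies(p, t)} copies")
```

Under these assumptions, less reliable nodes or a stricter availability target push the minimum copy count up. A dynamic protocol would additionally weigh that bound against the cost of keeping the extra copies coherent under the observed write rate.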