A checkpoint protocol for an entry consistent shared memory system
PODC '94 Proceedings of the thirteenth annual ACM symposium on Principles of distributed computing
A comprehensive bibliography of distributed shared memory
ACM SIGOPS Operating Systems Review
DiSOM is a software-based distributed shared memory system for a multicomputer composed of heterogeneous nodes connected by a high-speed, low-latency network [Guedes 93]. The current prototype comprises a Sun SPARCcenter 2000 with 10 processors and several SPARCstation 10, i486 PC, and DEC Alpha machines, connected by ATM and Ethernet.

Programs in DiSOM are written using a shared-memory multiprocessor model in which synchronization objects are explicitly associated with data items. A program is composed of a set of parallel threads of execution. These threads share data objects and synchronize through explicit calls to system-provided synchronization constructs. The system traps these calls and uses the information to drive both distributed synchronization and the memory coherence protocol. DiSOM uses the entry consistency memory model [Bershad 93] to ensure coherence. This model guarantees memory consistency as long as every access to a data item is enclosed between an acquire and a release on the synchronization object associated with that data item.

DiSOM addresses two issues we believe to be crucial in distributed shared memory systems: good performance, achieved by a close integration of the programming model with the synchronization mechanisms, and fault tolerance, provided by an efficient checkpointing algorithm that requires no extra messages during the failure-free period.
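
The acquire/release discipline required by entry consistency can be illustrated with a minimal single-process sketch. It is not DiSOM's interface, which the abstract does not specify: the names GuardedItem, acquire, release, read, write, worker, and run are hypothetical, and a local threading.Lock stands in for the distributed synchronization and coherence protocol. The sketch only shows a data item bound to its own synchronization object, with any access outside an acquire/release pair rejected.

```python
import threading


class GuardedItem:
    """Hypothetical data item explicitly bound to one synchronization object."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()   # stands in for a distributed lock
        self._owner = None              # thread id of the current holder

    def acquire(self):
        # In a real DSM system this is where the call would be trapped
        # and an up-to-date copy of the item would be obtained.
        self._lock.acquire()
        self._owner = threading.get_ident()

    def release(self):
        self._check_held()
        self._owner = None
        self._lock.release()

    def _check_held(self):
        # Reject accesses that are not enclosed in acquire/release.
        if self._owner != threading.get_ident():
            raise RuntimeError("access outside acquire/release")

    def read(self):
        self._check_held()
        return self._value

    def write(self, value):
        self._check_held()
        self._value = value


def worker(item, n):
    """Increment the shared item n times, each inside acquire/release."""
    for _ in range(n):
        item.acquire()
        try:
            item.write(item.read() + 1)
        finally:
            item.release()


def run(threads=4, n=1000):
    """Run several workers on one item and return its final value."""
    item = GuardedItem(0)
    pool = [threading.Thread(target=worker, args=(item, n))
            for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    item.acquire()
    try:
        return item.read()
    finally:
        item.release()
```

Calling run(4, 1000) starts four threads that each perform 1000 guarded increments. Because the lock travels with the data item rather than with a code region, the runtime in a system of this kind knows exactly which data must be made consistent at each acquire.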