The problem of rollback recovery in distributed shared virtual memory environments, in which the shared memory is implemented in software in a loosely coupled distributed multicomputer system, is examined. A user-transparent checkpointing recovery scheme and a new twin-page disk storage management technique are presented for implementing recoverable distributed shared virtual memory. The checkpointing scheme can be integrated with the memory coherence protocol for managing the shared virtual memory. The twin-page disk design allows checkpointing to proceed in an incremental fashion without an explicit undo at the time of recovery. The recoverable distributed shared virtual memory allows the system to restart computation from a checkpoint without a global restart.
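
The twin-page idea can be illustrated with a small in-memory sketch. This is not the paper's design: the class name TwinPageStore, the epoch tagging, and the set of committed epochs are assumptions made for illustration, and a real implementation would keep both page copies and the commit record on disk. The sketch shows only why recovery needs no explicit undo: each logical page has two slots, a write always overwrites the slot that does not hold the latest committed version, and a reader simply ignores any slot whose epoch was never committed.

```python
class TwinPageStore:
    """Toy twin-page store (hypothetical): two slots per logical page."""

    def __init__(self):
        self.slots = {}         # page_id -> [slot0, slot1]; a slot is (epoch, data) or None
        self.committed = set()  # epochs whose checkpoint completed
        self.epoch = 1          # epoch of the checkpoint under construction

    def _latest_committed(self, page_id):
        """Return the index of the slot holding the newest committed version, or None."""
        pair = self.slots.get(page_id, [None, None])
        best = None
        for i, slot in enumerate(pair):
            if slot is not None and slot[0] in self.committed:
                if best is None or slot[0] > pair[best][0]:
                    best = i
        return best

    def write(self, page_id, data):
        """Incremental checkpoint write: never overwrite the latest committed copy."""
        pair = self.slots.setdefault(page_id, [None, None])
        keep = self._latest_committed(page_id)
        target = 0 if keep is None else 1 - keep
        pair[target] = (self.epoch, data)

    def commit(self):
        """Mark the current checkpoint complete (assumed atomic in this sketch)."""
        self.committed.add(self.epoch)
        self.epoch += 1

    def recover(self):
        """Abandon the unfinished checkpoint; no page data is touched."""
        self.epoch += 1

    def read(self, page_id):
        """Return the page contents as of the last completed checkpoint."""
        keep = self._latest_committed(page_id)
        return None if keep is None else self.slots[page_id][keep][1]
```

In this sketch a failure in the middle of a checkpoint leaves stale data in one slot of each page written so far, but because that slot's epoch is never added to the committed set, read continues to return the previous checkpoint's contents, and the stale slot is simply overwritten by the next write to that page.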