Several hardware-based techniques that support checkpoint and rollback recovery are presented. The focus is on hardware schemes for uniprocessors, shared-memory multiprocessors, and distributed virtual-memory systems. A taxonomy of processor and memory techniques, organized by level of the memory hierarchy, is presented; it provides a basis for understanding the subtle differences among the various schemes. Two classes of schemes are discussed: processor-based schemes, which handle transient faults through transparent rollback techniques, and memory-based schemes, which roll back data instead of instructions and can either be integrated with the processor techniques or be exploited by higher levels of software.
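As a rough software analogy for the memory-based schemes, which roll back data rather than instructions, the following minimal sketch keeps an undo log of the words modified since the last checkpoint and restores them on rollback. It is not one of the hardware schemes surveyed here; the class `CheckpointedMemory` and its methods are hypothetical names chosen for illustration, and a real hardware scheme would track state at cache-line or page granularity rather than per word.

```python
class CheckpointedMemory:
    """Toy word-addressed memory with an undo log (all names are hypothetical)."""

    def __init__(self, size):
        self.words = [0] * size
        self.undo_log = {}  # address -> value held at the last checkpoint

    def checkpoint(self):
        # Commit: the current contents become the new recovery point,
        # so the saved old values are no longer needed.
        self.undo_log.clear()

    def write(self, addr, value):
        # Save the old value only on the first write since the checkpoint.
        if addr not in self.undo_log:
            self.undo_log[addr] = self.words[addr]
        self.words[addr] = value

    def rollback(self):
        # Restore every modified word to its checkpointed value.
        for addr, old in self.undo_log.items():
            self.words[addr] = old
        self.undo_log.clear()


mem = CheckpointedMemory(4)
mem.write(0, 10)
mem.checkpoint()   # recovery point: word 0 holds 10
mem.write(0, 99)
mem.write(1, 7)
mem.rollback()     # a detected fault would trigger this
```

After the rollback, the memory again holds the checkpointed contents, and execution could resume from the recovery point.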