Optimistic recovery in distributed systems
ACM Transactions on Computer Systems (TOCS)
Unreliable failure detectors for reliable distributed systems
Journal of the ACM (JACM)
Quasi-Synchronous Checkpointing: Models, Characterization, and Classification
IEEE Transactions on Parallel and Distributed Systems
Fail-stop processors: an approach to designing fault-tolerant computing systems
ACM Transactions on Computer Systems (TOCS)
Straightforward Java Persistence Through Checkpointing
Proceedings of the 8th International Workshop on Persistent Object Systems (POS8) and Proceedings of the 3rd International Workshop on Persistence and Java (PJW3): Advances in Persistent Object Systems
Automated application-level checkpointing of MPI programs
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Portable Checkpointing for Heterogeneous Archtitectures
FTCS '97 Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
An Analysis of Communication-Induced Checkpointing
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
A low-overhead recovery technique using quasi-synchronous checkpointing
ICDCS '96 Proceedings of the 16th International Conference on Distributed Computing Systems (ICDCS '96)
Asynchronous and deterministic objects
Proceedings of the 31st ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Synchronous, asynchronous, and causally ordered communication
Distributed Computing
Libckpt: transparent checkpointing under Unix
TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings
Promised messages: recovering from inconsistent global states
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Peer-to-Peer and fault-tolerance: Towards deployment-based technical services
Future Generation Computer Systems
Hi-index | 0.00 |
Communication Induced Checkpointing protocols usually make the assumption that any process can be checkpointed at any time. We propose an alternative approach which releases the constraint of always checkpointable processes, without delaying any message reception nor altering message ordering enforced by the communication layer or by the application. This protocol has been implemented within ProActive, an open source Java middleware for asynchronous and distributed objects implementing the ASP (Asynchronous Sequential Processes) model.