A global checkpoint is a set of local checkpoints, one per process. The traditional consistency criterion states that a global checkpoint is consistent if it includes no message that is recorded as received but not as sent. This paper investigates two other consistency criteria: transitlessness and strong consistency. A global checkpoint is transitless if it includes no message that is recorded as sent but not as received. Transitlessness can thus be seen as the dual of traditional consistency. Strong consistency is the conjunction of traditional consistency and transitlessness. The main result of this paper is a necessary and sufficient condition that answers the following question: "Given an arbitrary set of local checkpoints, can this set be extended to a global checkpoint that satisfies $\mathcal{P}$?" (where $\mathcal{P}$ is traditional consistency, transitlessness, or strong consistency). From a practical point of view, this condition, when applied to transitlessness, is particularly interesting because it helps characterize which messages do not need to be recorded by checkpointing protocols.
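The three criteria can be illustrated with a minimal Python sketch. It is not the paper's algorithm or notation: it assumes a hypothetical model in which each process's events are numbered from 0, a local checkpoint is represented by the number of events that precede it, and each message records the event positions of its send and its receive. The names `Message`, `is_consistent`, `is_transitless`, and `is_strongly_consistent` are introduced here for illustration only. The sketch only checks a complete global checkpoint; it does not decide whether a partial set of local checkpoints can be extended.

```python
from typing import Dict, List, NamedTuple


class Message(NamedTuple):
    """A message in a hypothetical event-indexed model (an assumption of this sketch)."""
    sender: str
    send_pos: int   # index of the send event on the sender
    receiver: str
    recv_pos: int   # index of the receive event on the receiver


# A global checkpoint maps each process to the number of events that
# precede its local checkpoint (one local checkpoint per process).
GlobalCheckpoint = Dict[str, int]


def _sent(cut: GlobalCheckpoint, m: Message) -> bool:
    # The send is recorded if it happens before the sender's checkpoint.
    return m.send_pos < cut[m.sender]


def _received(cut: GlobalCheckpoint, m: Message) -> bool:
    # The receive is recorded if it happens before the receiver's checkpoint.
    return m.recv_pos < cut[m.receiver]


def is_consistent(cut: GlobalCheckpoint, messages: List[Message]) -> bool:
    """Traditional consistency: no message received and not sent."""
    return not any(_received(cut, m) and not _sent(cut, m) for m in messages)


def is_transitless(cut: GlobalCheckpoint, messages: List[Message]) -> bool:
    """Transitlessness: no message sent and not received."""
    return not any(_sent(cut, m) and not _received(cut, m) for m in messages)


def is_strongly_consistent(cut: GlobalCheckpoint, messages: List[Message]) -> bool:
    """Strong consistency: traditional consistency plus transitlessness."""
    return is_consistent(cut, messages) and is_transitless(cut, messages)


# One message: p sends it as its event 0, q receives it as its event 1.
msgs = [Message("p", 0, "q", 1)]

orphan_cut = {"p": 0, "q": 2}    # receive recorded, send not recorded
transit_cut = {"p": 1, "q": 0}   # send recorded, receive not recorded
both_cut = {"p": 1, "q": 2}      # send and receive both recorded
```

With this single message, `orphan_cut` is transitless but not consistent, `transit_cut` is consistent but not transitless, and `both_cut` is strongly consistent, which shows the duality between the two criteria.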