In this paper, we explore the isomorphism between vector time and causality to characterize the consistency of a set of checkpoints in a distributed computation. A necessary and sufficient condition for determining whether a set of checkpoints can form a consistent global checkpoint is presented and proved using the isomorphism between vector time and causality. To the best of our knowledge, this is the first attempt to use the isomorphism for this purpose. This condition leads to a simple and straightforward algorithm for global checkpointing that is guaranteed to be mutually consistent. In our approach, a process can take a checkpoint whenever and wherever it wants, while other related processes may be asked to take an additional checkpoint to ensure mutual consistency. We also show how this condition and the resulting algorithm can be used to obtain the maximum and minimum global checkpoints, another important paradigm for distributed applications.
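To make the idea of checking checkpoint consistency with vector time concrete, here is a minimal sketch of the textbook vector-clock test for a consistent cut. It is not the condition or algorithm proved in the paper, which the abstract does not state. It assumes one checkpoint per process, that `checkpoints[i]` holds the vector timestamp recorded by process `i`, and that component `i` counts events of process `i`. The function names `is_consistent` and `violations` are hypothetical.

```python
from typing import List, Sequence, Tuple


def violations(checkpoints: Sequence[Sequence[int]]) -> List[Tuple[int, int]]:
    """Return pairs (i, j) where process j's checkpoint has seen more events
    of process i than process i's own checkpoint records.

    Assumption: checkpoints[i] is the vector timestamp of process i's
    checkpoint, and entry i of a vector counts events of process i.
    """
    n = len(checkpoints)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and checkpoints[j][i] > checkpoints[i][i]
    ]


def is_consistent(checkpoints: Sequence[Sequence[int]]) -> bool:
    """Textbook test: the set is consistent when no checkpoint depends on an
    event that lies after another process's checkpoint."""
    return not violations(checkpoints)


# Two processes. P1's checkpoint has seen only one event of P0, and P0's
# checkpoint covers two events, so nothing is received "from the future".
consistent_set = [[2, 0], [1, 3]]

# P1's checkpoint has seen two events of P0, but P0 checkpointed after its
# first event, so P1 recorded a message that P0's checkpoint never sent.
inconsistent_set = [[1, 0], [2, 3]]

print(is_consistent(consistent_set))
print(is_consistent(inconsistent_set))
print(violations(inconsistent_set))
```

In this sketch, each pair reported by `violations` names a process `i` whose checkpoint is too early relative to process `j`. This is the kind of situation in which, per the abstract, a related process may be asked to take an additional checkpoint.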