ICPP '98 Proceedings of the 1998 International Conference on Parallel Processing
In this paper, we propose a new approach to designing a simple and efficient nonblocking synchronous checkpointing algorithm for distributed systems. In general, such algorithms require all processes to take checkpoints, even though some of those checkpoints may not be necessary. In the present work, a process does not take a checkpoint if it has sent one or more messages since its last checkpoint but none of them has yet been received. This reduces the number of checkpoints taken. The approach is particularly advantageous in mobile computing environments, where both nonblocking checkpointing and a reduced number of checkpoints help make efficient use of limited resources.
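
The skip rule stated in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It only encodes the single condition the abstract gives (messages sent since the last checkpoint, none of them received yet). The names `ProcessState`, `can_skip_checkpoint`, and `on_checkpoint_request` are hypothetical, and the assumption that a sender learns of delivery (for example through acknowledgements) is ours, since the abstract does not say how this is detected. In every case the abstract does not cover, the sketch falls back to taking the checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessState:
    """Hypothetical bookkeeping for one process since its last checkpoint."""
    sent: set = field(default_factory=set)       # ids of messages sent
    delivered: set = field(default_factory=set)  # ids known to be received
                                                 # (assumption: learned via acks)

    def record_send(self, msg_id):
        self.sent.add(msg_id)

    def record_delivery(self, msg_id):
        # Only track deliveries of messages this process actually sent.
        if msg_id in self.sent:
            self.delivered.add(msg_id)

    def reset(self):
        # Called after a checkpoint is taken: start a new interval.
        self.sent.clear()
        self.delivered.clear()


def can_skip_checkpoint(state):
    """True only in the case the abstract describes:
    at least one message sent, and none of them received yet."""
    return bool(state.sent) and not state.delivered


def on_checkpoint_request(state):
    """Return 'skipped' or 'taken' for a checkpoint request.
    Any case not covered by the abstract defaults to taking the checkpoint."""
    if can_skip_checkpoint(state):
        return "skipped"
    state.reset()
    return "taken"
```

For example, a process that has sent `m1` and seen no delivery would skip the checkpoint, while the same process would take it once `m1` is known to have been received.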