Deadlocks in fully uncoordinated checkpointing rollback recovery systems

Authors:
V. Shah; S. Sanyal; S. Bhattacharya
Venue:
Proceedings of the 3rd Workshop on Object-Oriented Real-Time Dependable Systems (WORDS '97)
Year:
1997

Abstract

Synchronization issues in checkpointing and rollback-recovery schemes have been studied in depth over the past few years. The authors investigate the possibility of deadlocks in a fully uncoordinated checkpointing system. A protocol for a fully uncoordinated checkpointing scheme is first illustrated. Rollback propagation analysis (RPA) is then performed using a stack-based algorithm. The probability of deadlock due to rollbacks is computed for a finite buffer size, and the optimal number of buffers required to eliminate the possibility of deadlock is calculated. Finally, the predicted buffer size is compared with the simulated result. The simulation study shows that the probability of deadlock decreases as the number of buffers increases, until an optimal buffer size is reached at which the deadlock probability becomes zero.