Manetho: Transparent Rollback-Recovery with Low Overhead, Limited Rollback, and Fast Output Commit
IEEE Transactions on Computers - Special issue on fault-tolerant computing
On the relevance of communication costs of rollback-recovery protocols
Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing
A Non-Blocking Recovery Algorithm for Causal Message Logging
SRDS '98 Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems
To reduce the number of stable storage accesses and to impose no restrictions on the execution of live processes during recovery, Elnozahy proposed a recovery algorithm based on causal message logging. However, when combined with independent checkpointing, that algorithm may leave the system in an inconsistent state if processes fail concurrently. In this paper, we identify the cases that lead to inconsistency and present a recovery algorithm that achieves consistent recovery by letting the recovery leader collect recovery information from the other recovering processes as well as from all live ones. Our recovery algorithm requires no additional messages compared with Elnozahy's algorithm.