A Distributed Algorithm for Fault Diagnosis in Systems with Soft Failures
IEEE Transactions on Computers
Fault-Tolerant Networks Based on the de Bruijn Graph
IEEE Transactions on Computers
Hamiltonian graphs with minimum number of edges for fault-tolerant topologies
Information Processing Letters
Distributed fault-tolerance for large multiprocessor systems
ISCA '80 Proceedings of the 7th annual symposium on Computer Architecture
Designing Reliable Architecture for Stateful Fault Tolerance
PDCAT '06 Proceedings of the Seventh International Conference on Parallel and Distributed Computing, Applications and Technologies
Spin model checker, the: primer and reference manual
Spin model checker, the: primer and reference manual
In [8], a high-availability framework that uses a Harary graph as the network topology was proposed for stateful failover. The framework has an interesting property: a uniform load can be assigned to each non-faulty node while fault tolerance is maintained. A challenging problem in this context, not addressed in [8], is to devise a distributed algorithm for automated fault recovery that exploits the properties of the framework. In this work, we propose a distributed algorithm with low message and round complexity for automated fault recovery under stateful failover. We then prove the correctness of the algorithm using techniques from formal verification. The safety, liveness, and timeliness properties of the algorithm have been verified with the SPIN model checker.
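
To make the topology concrete, the following is a minimal sketch of the standard Harary graph construction for even k, together with a brute-force check of how many node failures it survives. It is not the recovery algorithm proposed in this work, nor the framework of [8]; the function names `harary_even`, `connected_without`, and `tolerates` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def harary_even(k, n):
    """Adjacency sets of the Harary graph H(k, n) for even k (illustrative helper)."""
    if k % 2 != 0 or not 2 <= k < n:
        raise ValueError("this sketch only handles even k with 2 <= k < n")
    r = k // 2
    adj = {v: set() for v in range(n)}
    for v in range(n):
        # Place the nodes on a ring and link each one to its r nearest
        # neighbours on either side, giving every node degree k.
        for d in range(1, r + 1):
            u = (v + d) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj


def connected_without(adj, failed):
    """True if the nodes outside `failed` still form one connected component."""
    alive = [v for v in adj if v not in failed]
    if not alive:
        return True
    seen = {alive[0]}
    stack = [alive[0]]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u not in failed and u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(alive)


def tolerates(adj, f):
    """Brute force: does every set of f failed nodes leave the rest connected?"""
    return all(connected_without(adj, set(c)) for c in combinations(adj, f))
```

For example, `harary_even(4, 8)` gives every node exactly four neighbours; `tolerates(adj, 3)` holds for that graph, while `tolerates(adj, 4)` does not, because removing all four neighbours of a node isolates it. The exhaustive check is exponential in the number of failures and is meant only for small illustrative instances.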