In this paper, we present a software framework for adding fault-tolerance to existing finite-state programs. The input to our framework is a fault-intolerant program and a class of faults that perturbs the program. The output of our framework is a fault-tolerant version of the input program. Our framework provides (1) the first automated tool for the synthesis of fault-tolerant distributed programs, and (2) an extensible platform for researchers to develop a repository of heuristics that deal with the complexity of adding fault-tolerance to distributed programs. We also present a set of heuristics for polynomial-time addition of fault-tolerance to distributed programs. We have used this framework for automated synthesis of several fault-tolerant programs including a simplified version of an aircraft altitude switch, token ring, Byzantine agreement, and agreement in the presence of Byzantine and fail-stop faults. These examples illustrate that our framework can be used for synthesizing programs that tolerate different types of faults (process restarts, Byzantine and fail-stop) and programs that are subject to multiple faults (Byzantine and fail-stop) simultaneously. We have found our framework to be highly useful for pedagogical purposes, especially for teaching concepts of fault-tolerance, automatic program transformation, and the effect of heuristics.
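The input/output contract described above (a fault-intolerant finite-state program plus a class of faults in, a fault-tolerant program out) can be illustrated with a toy sketch. This is not the framework's algorithm or API: the function names, the explicit `invariant` and `allowed` parameters, and the integer state encoding are all assumptions made for illustration. The sketch also ignores the constraints that distribution imposes on which transitions a process may execute, and it assumes the original program has no transitions that start outside the invariant.

```python
from collections import deque


def reachable(start, transitions):
    """Return all states reachable from `start` via `transitions` (BFS)."""
    seen = set(start)
    queue = deque(start)
    while queue:
        state = queue.popleft()
        for src, dst in transitions:
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def add_recovery(invariant, program, faults, allowed):
    """Hypothetical helper: add recovery transitions to a toy program.

    invariant: set of legitimate states
    program:   set of (src, dst) program transitions
    faults:    set of (src, dst) fault transitions
    allowed:   set of (src, dst) candidate recovery transitions
    Returns (tolerant_program, fault_span, unrecoverable_states).
    """
    # Fault span: every state the program can reach when faults occur.
    fault_span = reachable(invariant, program | faults)

    # Grow the set of states that can recover, working backward from the
    # invariant. Each state receives one recovery transition, and it always
    # targets a state that could already recover, so the added transitions
    # cannot form a cycle outside the invariant.
    can_recover = set(invariant)
    recovery = set()
    changed = True
    while changed:
        changed = False
        for src, dst in sorted(allowed):
            if src in fault_span and src not in can_recover and dst in can_recover:
                recovery.add((src, dst))
                can_recover.add(src)
                changed = True

    unrecoverable = fault_span - can_recover
    return program | recovery, fault_span, unrecoverable


# Toy example: states 0 and 1 are legitimate; faults push the program to 2 and 3.
tolerant, span, stuck = add_recovery(
    invariant={0, 1},
    program={(0, 1), (1, 0)},
    faults={(1, 2), (2, 3)},
    allowed={(2, 1), (3, 0), (3, 2)},
)
print(sorted(tolerant))  # [(0, 1), (1, 0), (2, 1), (3, 0)]
print(sorted(stuck))     # []
```

A non-empty `unrecoverable` set means that the candidate transitions supplied were not enough to guarantee recovery from every state that faults can reach. Heuristics would matter at this point in a realistic setting, since the choice of which transitions to add or remove determines whether a tolerant program is found.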