The consensus problem in fault-tolerant computing
ACM Computing Surveys (CSUR)
Closure and Convergence: A Foundation of Fault-Tolerant Computing
IEEE Transactions on Software Engineering - Special issue on software reliability
Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)
The Byzantine Generals Problem
ACM Transactions on Programming Languages and Systems (TOPLAS)
Synthesis of concurrent programs for an atomic read/write model of computation
ACM Transactions on Programming Languages and Systems (TOPLAS)
A Discipline of Programming
Advanced Concepts in Operating Systems
Advanced Concepts in Operating Systems
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Compositional Design of Multitolerant Repetitive Byzantine Agreement
Proceedings of the 17th Conference on Foundations of Software Technology and Theoretical Computer Science
Detectors and Correctors: A Theory of Fault-Tolerance Components
ICDCS '98 Proceedings of the The 18th International Conference on Distributed Computing Systems
The Complexity of Adding Failsafe Fault-Tolerance
ICDCS '02 Proceedings of the 22 nd International Conference on Distributed Computing Systems (ICDCS'02)
Component based design of fault-tolerance
Component based design of fault-tolerance
Automatic synthesis of fault-tolerance
Automatic synthesis of fault-tolerance
Designing Run-Time Fault-Tolerance Using Dynamic Updates
SEAMS '07 Proceedings of the 2007 International Workshop on Software Engineering for Adaptive and Self-Managing Systems
Diconic addition of failsafe fault-tolerance
Proceedings of the twenty-second IEEE/ACM international conference on Automated software engineering
Developing parallel programs: A design-oriented perspective
IWMSE '09 Proceedings of the 2009 ICSE Workshop on Multicore Software Engineering
Automating the addition of fault tolerance with discrete controller synthesis
Formal Methods in System Design
Weakest Invariant Generation for Automated Addition of Fault-Tolerance
Electronic Notes in Theoretical Computer Science (ENTCS)
A pattern-based approach for modeling and analyzing error recovery
Architecting dependable systems IV
Feasibility of Stepwise Design of Multitolerant Programs
ACM Transactions on Software Engineering and Methodology (TOSEM)
Robustness in the presence of liveness
CAV'10 Proceedings of the 22nd international conference on Computer Aided Verification
Hi-index | 0.00 |
We focus on the problem of synthesizing failsafe fault-tolerance where fault-tolerance is added to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its specification (including safety and liveness) in the absence of faults. However, in the presence of faults, it satisfies its safety specification. We present a somewhat unexpected result that, in general, the problem of synthesizing failsafe fault-tolerant distributed programs from their fault-intolerant version is NP-complete in the state space of the program. We also identify a class of specifications, monotonic specifications, and a class of programs, monotonic programs, for which the synthesis of failsafe fault-tolerance can be done in polynomial time (in program state space). As an illustration, we show that the monotonicity restrictions are met for commonly encountered problems, such as Byzantine agreement, distributed consensus, and atomic commitment. Furthermore, we evaluate the role of these restrictions in the complexity of synthesizing failsafe fault-tolerance. Specifically, we prove that if only one of these conditions is satisfied, the synthesis of failsafe fault-tolerance is still NP-complete. Finally, we demonstrate the application of monotonicity property in enhancing the fault-tolerance of (distributed) nonmasking fault-tolerant programs to masking.