In this paper, we focus on automated techniques to enhance the fault-tolerance of a nonmasking fault-tolerant program to masking. A masking program continually satisfies its specification even if faults occur. By contrast, a nonmasking program merely guarantees that after faults stop occurring, the program recovers to states from which it continually satisfies its specification. Until the recovery is complete, however, a nonmasking program can violate its (safety) specification. Thus, the problem of enhancing fault-tolerance from nonmasking to masking requires that safety be added and recovery be preserved. We focus on this enhancement problem for high atomicity programs (where each process can read all variables) and for distributed programs (where restrictions are imposed on what processes can read and write). We present a sound and complete algorithm for high atomicity programs and a sound algorithm for distributed programs. We also argue that our algorithms are simpler than previous algorithms, where masking fault-tolerance is added to a fault-intolerant program. Hence, these algorithms can partially reap the benefits of automation when the cost of adding masking fault-tolerance to a fault-intolerant program is high. To illustrate these algorithms, we show how the masking fault-tolerant programs for triple modular redundancy and Byzantine agreement can be obtained by enhancing the fault-tolerance of the corresponding nonmasking versions. We also discuss how the derivation of these programs is simplified when we begin with a nonmasking fault-tolerant program.
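The distinction between nonmasking and masking tolerance can be made concrete on a toy finite-state model. The sketch below is only an illustration of the definitions in the abstract, not the algorithm presented in the paper. The state names, the four-state example loosely inspired by triple modular redundancy, and the helpers `reachable`, `recovers`, `classify`, and `enhance` are all hypothetical. It assumes that safety is given as a set of bad states and that faults never lead directly into a bad state.

```python
from collections import deque

def reachable(start, edges):
    """Return every state reachable from the set `start` by following `edges`."""
    seen, queue = set(start), deque(start)
    while queue:
        s = queue.popleft()
        for (a, b) in edges:
            if a == s and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen

def recovers(span, program, invariant):
    """True if, with no further faults, every program path from `span`
    reaches `invariant` (no deadlock or cycle outside the invariant)."""
    good = set(invariant)
    changed = True
    while changed:
        changed = False
        for s in span - good:
            succ = {b for (a, b) in program if a == s}
            if succ and succ <= good:
                good.add(s)
                changed = True
    return span <= good

def classify(program, faults, invariant, bad):
    """Classify the tolerance level of `program` under `faults`."""
    span = reachable(invariant, program | faults)
    if not recovers(span, program, invariant):
        return "intolerant"
    return "nonmasking" if span & bad else "masking"

def enhance(program, faults, invariant, bad):
    """Naive enhancement (an assumption of this sketch): drop program
    transitions that enter a bad state, then check that safety holds
    and recovery is preserved. Returns None if the check fails."""
    safe_program = {(a, b) for (a, b) in program if b not in bad}
    span = reachable(invariant, safe_program | faults)
    if span & bad or not recovers(span, safe_program, invariant):
        return None
    return safe_program

# Hypothetical example: one of three inputs may be corrupted before
# the output is written.
INVARIANT = {"idle", "done"}
BAD = {"wrong_out"}                      # output holds a corrupted value
FAULTS = {("idle", "one_bad")}           # a fault corrupts one input
PROGRAM = {
    ("idle", "done"),                    # copy an input to the output
    ("one_bad", "wrong_out"),            # copy the corrupted input (unsafe)
    ("one_bad", "done"),                 # copy the majority value
    ("wrong_out", "done"),               # later correct the output
}

before = classify(PROGRAM, FAULTS, INVARIANT, BAD)
ENHANCED = enhance(PROGRAM, FAULTS, INVARIANT, BAD)
after = classify(ENHANCED, FAULTS, INVARIANT, BAD)
print(before, "->", after)
```

In this example the original program recovers from the fault but may pass through `wrong_out` on the way, so it is only nonmasking. Removing the single unsafe transition adds safety while leaving the recovery path through the majority value intact. A realistic procedure would also have to deal with states from which fault transitions alone can violate safety, and with the read/write restrictions of distributed programs, neither of which this sketch attempts.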