Enhancing The Fault-Tolerance of Nonmasking Programs

Authors:
Sandeep S. Kulkarni; Ali Ebnenasir
Venue:
ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
Year:
2003

Abstract

In this paper, we focus on automated techniques to enhance the fault-tolerance of a nonmasking fault-tolerant program to masking. A masking program continually satisfies its specification even if faults occur. By contrast, a nonmasking program merely guarantees that after faults stop occurring, the program recovers to states from where it continually satisfies its specification. Until the recovery is complete, however, a nonmasking program can violate its (safety) specification. Thus, the problem of enhancing fault-tolerance from nonmasking to masking requires that safety be added and recovery be preserved. We focus on this enhancement problem for high atomicity programs, where each process can read all variables, and for distributed programs, where restrictions are imposed on what processes can read and write. We present a sound and complete algorithm for high atomicity programs and a sound algorithm for distributed programs. We also argue that our algorithms are simpler than previous algorithms, where masking fault-tolerance is added to a fault-intolerant program. Hence, these algorithms can partially reap the benefits of automation when the cost of adding masking fault-tolerance to a fault-intolerant program is high. To illustrate these algorithms, we show how the masking fault-tolerant programs for triple modular redundancy and Byzantine agreement can be obtained by enhancing the fault-tolerance of the corresponding nonmasking versions. We also discuss how the derivation of these programs is simplified when we begin with a nonmasking fault-tolerant program.
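
The distinction between masking and nonmasking fault-tolerance drawn in the abstract can be illustrated on a toy finite-state model. The sketch below is not the paper's algorithm. It assumes a program and its faults are given as explicit transition relations over named states, that safety is simplified to a set of forbidden ("bad") states, and that recovery means every program-only computation eventually reaches the invariant. All names (`classify`, `recovers`, the state labels) are hypothetical.

```python
from typing import Dict, Set

Transitions = Dict[str, Set[str]]


def reachable(start: Set[str], *relations: Transitions) -> Set[str]:
    """States reachable from `start` using any of the given transition relations."""
    seen = set(start)
    frontier = list(start)
    while frontier:
        s = frontier.pop()
        for rel in relations:
            for t in rel.get(s, set()):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
    return seen


def recovers(states: Set[str], program: Transitions, invariant: Set[str]) -> bool:
    """True if every program-only computation from each state reaches the invariant."""
    good = set(invariant)
    changed = True
    while changed:
        changed = False
        for s in states - good:
            succ = program.get(s, set())
            # s is fine if it is not deadlocked and all its successors are fine.
            if succ and succ <= good:
                good.add(s)
                changed = True
    return states <= good


def classify(program: Transitions, faults: Transitions,
             invariant: Set[str], bad: Set[str]) -> str:
    """Classify a toy program as 'masking', 'nonmasking', or 'intolerant'."""
    # Fault-span: everything reachable from the invariant when faults may occur.
    span = reachable(invariant, program, faults)
    if not recovers(span, program, invariant):
        return "intolerant"
    # Assumption: safety is violated exactly when a bad state is reachable.
    return "nonmasking" if span & bad else "masking"


invariant = {"ok"}
bad = {"bad"}
faults = {"ok": {"err"}}

# Recovers from "err", but only by passing through the unsafe state "bad".
nonmasking_prog = {"ok": {"ok"}, "err": {"bad"}, "bad": {"ok"}}
# Recovers from "err" directly, never visiting "bad".
masking_prog = {"ok": {"ok"}, "err": {"ok"}}
# Stays stuck in "err" forever after a fault.
intolerant_prog = {"ok": {"ok"}, "err": {"err"}}
```

In this toy setting, going from `nonmasking_prog` to `masking_prog` mirrors the enhancement problem as the abstract states it: the unsafe step is removed (safety is added) while the program still returns to the invariant (recovery is preserved).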