Feasibility of Stepwise Design of Multitolerant Programs

Authors:
Ali Ebnenasir;Sandeep S. Kulkarni
Affiliations:
Michigan Technological University;Michigan State University
Venue:
ACM Transactions on Software Engineering and Methodology (TOSEM)
Year:
2011

Abstract

Designing programs that simultaneously tolerate multiple classes of faults, called multitolerant programs, is complex in part because the fault tolerance requirements that such a program must meet when different types of faults occur can conflict. To facilitate the design of multitolerant programs, we present sound and (deterministically) complete algorithms for stepwise design of two families of multitolerant programs in a high atomicity program model, where a process can read and write all program variables in an atomic step. We show that if one needs to design failsafe (respectively, nonmasking) fault tolerance for one class of faults and masking fault tolerance for another class of faults, then a multitolerant program can be designed in separate steps, each polynomial-time in the state space of the fault-intolerant program, regardless of the order of addition. This result has a significant methodological implication: designers need not be concerned about unknown fault tolerance requirements that may arise due to unanticipated types of faults. Further, we show that if one needs to design failsafe fault tolerance for one class of faults and nonmasking fault tolerance for a different class of faults, then the resulting problem is NP-complete in the program state space. This result is counterintuitive, because failsafe and nonmasking fault tolerance for the same class of faults can be designed in polynomial time. We also present sufficient conditions for polynomial-time design of failsafe-nonmasking multitolerance. Finally, we demonstrate the stepwise design of multitolerance for a stable disk storage system, a token ring network protocol, and a repetitive agreement protocol that tolerates Byzantine and transient faults. Our automatic approach decreases the design time from days to a few hours for the token ring program, our largest example, which has 200 million reachable states and 8 processes.
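To make the notion of a polynomial-time step in the program state space more concrete, the following is a minimal sketch of one ingredient that a failsafe-addition step could use. It is not the authors' algorithm. It assumes a program, a fault class, and a safety specification are each given as explicit sets of state-to-state transitions, and the names `unsafe_states` and `prune_for_failsafe` are hypothetical. The sketch finds the states from which faults alone can violate safety, then removes program transitions that violate safety or reach such states.

```python
from typing import Set, Tuple

State = int
Transition = Tuple[State, State]


def unsafe_states(faults: Set[Transition], bad: Set[Transition]) -> Set[State]:
    """Return states from which fault transitions alone can execute a bad transition.

    Assumption: `bad` encodes the safety specification as a set of forbidden
    transitions; `faults` is the set of fault transitions.
    """
    # Start with states where a single fault step is itself a safety violation.
    ms = {s for (s, t) in faults if (s, t) in bad}
    changed = True
    # Backward fixpoint: add any state with a fault step into an unsafe state.
    while changed:
        changed = False
        for (s, t) in faults:
            if t in ms and s not in ms:
                ms.add(s)
                changed = True
    return ms


def prune_for_failsafe(
    program: Set[Transition], faults: Set[Transition], bad: Set[Transition]
) -> Set[Transition]:
    """Drop program transitions that are bad or that lead into an unsafe state."""
    ms = unsafe_states(faults, bad)
    return {(s, t) for (s, t) in program if (s, t) not in bad and t not in ms}


# Hypothetical five-state example: a fault from state 3 to state 4 violates safety.
program = {(0, 1), (1, 2), (2, 0), (1, 3)}
faults = {(3, 4)}
bad = {(3, 4)}
pruned = prune_for_failsafe(program, faults, bad)
```

In this toy instance the program step from state 1 to state 3 is removed, because a fault occurring in state 3 would violate safety. The fixpoint loop runs at most once per state and scans the fault set each time, so the cost is polynomial in the size of the explicit state space. A full design step would also have to handle the invariant and deadlock states, which this sketch omits.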