Designing a fault-tolerant program is known to be inherently difficult, and decisions taken during the design process invariably affect the efficiency of the resulting program. In this paper, we focus on two such decisions: (i) the class of faults the program is to tolerate, and (ii) the variables that can be read and written. The impact of these decisions on the overall fault tolerance of the system needs to be well understood; otherwise, costly redesigns can result. For the first decision, we study the impact of fault classes on the efficiency of fail-safe fault tolerance and show that, under the assumption of a general fault model, it is impossible to preserve the original behavior of the fault-intolerant program. For the second decision, concerning read and write constraints on variables, we again show that it is impossible to preserve the original behavior of the fault-intolerant program. We analyze the reasons behind these impossibility results and suggest possible ways of circumventing them.
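To make the first impossibility claim more concrete, the following toy sketch illustrates one way fail-safe fault tolerance can cost original behavior. It is not the construction used in the paper. The state names, the transition sets, and the helpers `unsafe_states` and `make_failsafe` are all hypothetical, and the model assumes that a fault transition may fire in any state where it is enabled. Under that assumption, any state from which faults alone can reach a safety-violating state must be avoided, so the program transitions entering it are removed.

```python
# Toy state-transition model. Every state and transition is hypothetical
# and chosen only for illustration; this is not the paper's formal model.
BAD = {"crash"}  # states that violate the safety specification

# Transitions of the fault-intolerant program.
PROGRAM = {("idle", "armed"), ("armed", "fire"), ("fire", "idle")}

# Fault transitions, assumed to be able to fire whenever they are enabled.
FAULTS = {("armed", "crash")}


def unsafe_states(bad, faults):
    """Return the states from which fault transitions alone reach a bad state."""
    unsafe = set(bad)
    changed = True
    while changed:  # backward closure over fault transitions
        changed = False
        for src, dst in faults:
            if dst in unsafe and src not in unsafe:
                unsafe.add(src)
                changed = True
    return unsafe


def make_failsafe(program, bad, faults):
    """Keep only the program transitions that do not enter an unsafe state."""
    unsafe = unsafe_states(bad, faults)
    return {(src, dst) for (src, dst) in program if dst not in unsafe}


failsafe = make_failsafe(PROGRAM, BAD, FAULTS)
removed = PROGRAM - failsafe
print("unsafe states:", sorted(unsafe_states(BAD, FAULTS)))
print("removed program transitions:", sorted(removed))
```

In this toy model the only transition leaving `idle` is removed, so the fail-safe version can no longer reproduce the fault-free cycle of the original program. Safety is kept, but at the expense of the original behavior.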