Issues on the design of efficient fail-safe fault tolerance

  • Authors:
  • Arshad Jhumka; Matt Leeke

  • Affiliations:
  • Department of Computer Science, University of Warwick, Coventry, UK; Department of Computer Science, University of Warwick, Coventry, UK

  • Venue:
  • ISSRE'09: Proceedings of the 20th IEEE International Conference on Software Reliability Engineering
  • Year:
  • 2009

Abstract

The design of a fault-tolerant program is known to be an inherently difficult task. Decisions taken during the design process invariably affect the efficiency of the resulting fault-tolerant program. In this paper, we focus on two such decisions, namely (i) the class of faults the program is to tolerate, and (ii) the variables that can be read and written. The impact of these design decisions on the overall fault tolerance of the system needs to be well understood; failing to understand it can lead to costly redesigns. On the impact of fault classes on the efficiency of fail-safe fault tolerance, we show that, under the assumption of a general fault model, it is impossible to preserve the original behavior of the fault-intolerant program. On the second problem, that of read and write constraints on variables, we again show that it is impossible to preserve the original behavior of the fault-intolerant program. We analyze the reasons behind these impossibility results and suggest possible ways of circumventing them.