Emulation of Transient Software Faults for Dependability Assessment: A Case Study

  • Authors:
  • Roberto Natella;Domenico Cotroneo

  • Affiliations:
  • -;-

  • Venue:
  • EDCC '10 Proceedings of the 2010 European Dependable Computing Conference
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Fault Tolerance Mechanisms (FTMs) are extensively used in software systems to counteract software faults, in particular against faults that manifest transiently, namely Mandelbugs. In this scenario, Software Fault Injection (SFI) plays a key role for the verification and the improvement of FTMs. However, no previous work investigated whether SFI techniques are able to emulate Mandelbugs adequately. This is an important concern for assessing critical systems, since Mandelbugs are a major cause of failures, and FTMs are specifically tailored for this class of software faults. In this paper, we analyze an existing state-of-the-art SFI technique, namely G-SWFIT, in the context of a real-world fault-tolerant system for Air Traffic Control (ATC). The analysis highlights limitations of G-SWFIT regarding its ability to emulate the transient nature of Mandelbugs, because most of injected faults are activated in the early phase of execution, and they deterministically affect process replicas in the system. We also notice that G-SWFIT leaves untested the 35% of states of the considered system. Moreover, by means of an experiment, we show how emulation of Mandelbugs is useful to improve SFI. In particular, we emulate concurrency faults, which are a critical sub-class of Mandelbugs, in a fully representative way. We show that proper fault triggering can increase the confidence in FTMs' testing, since it is possible to reduce the amount of untested states down to 5%.