Testing of fault-tolerant and real-time distributed systems via protocol fault injection

  • Authors:
  • S. Dawson;F. Jahanian;T. Mitton;Teck-Lee Tung

  • Affiliations:
  • -;-;-;-

  • Venue:
  • FTCS '96 Proceedings of the The Twenty-Sixth Annual International Symposium on Fault-Tolerant Computing (FTCS '96)
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

As software for distributed systems becomes more complex, ensuring that a system meets its prescribed specification is a growing challenge that confronts software developers. This is particularly important for distributed applications with strict dependability and timeliness constraints. This paper reports on ORCHESTRA, a portable fault injection environment for testing implementations of distributed protocols. This tool is based on a simple yet powerful framework called script-driven probing and fault injection, for the evaluation and validation of the fault-tolerance and timing characteristics of distributed protocols. The tool, which was initially developed on the Real-Time Mach operating system and later ported to other platforms including Solaris and SunOS, has been used to conduct extensive experiments on several protocol implementations. This paper describes the design and implementation of the fault injection tool focusing on architectural features to support portability, minimizing intrusiveness on target protocols, and explicit support for testing real-time systems. The paper also describes the experimental evaluation of two protocol implementations: a real-time audio-conferencing application on Real-Time Mach, and a distributed group membership service on the Sun Solaris operating system.