Rigorous Development of an Embedded Fault-Tolerant System Based on Coordinated Atomic Actions

  • Authors:
  • Jie Xu;Alexander Romanovsky;Robert J. Stroud;Avelino F. Zorzo;Ercument Canver;Friedrich von Henke

  • Affiliations:
  • Univ. of Durham, Durham, UK;Univ. of Newcastle, Newcastle, UK;Univ. of Newcastle, Newcastle, UK;PUCRS, Brazil;Univ. of Ulm, Ulm, Germany;Univ. of Ulm, Ulm, Germany

  • Venue:
  • IEEE Transactions on Computers - Special issue on fault-tolerant embedded systems
  • Year:
  • 2002

Abstract

This paper describes our experience using coordinated atomic (CA) actions as a system structuring tool to design and validate a sophisticated embedded control system for a complex industrial application with high reliability and safety requirements. Our study is based on an extended production cell model whose specification and simulator were defined and developed by FZI (Forschungszentrum Informatik, Germany). This "Fault-Tolerant Production Cell" represents a manufacturing process involving redundant mechanical devices, provided to enable continued production in the presence of machine faults. The challenge posed by the model specification is to design a control system that maintains specified safety and liveness properties even in the presence of a large number and variety of device and sensor failures. Based on an analysis of such failures, we provide details of: 1) a design for a control program that uses CA actions to deal with both safety-related and fault tolerance concerns and 2) the formal verification of this design using model checking. We found that CA action structuring facilitated both the design and verification tasks by enabling the various safety problems (involving possible clashes of moving machinery) to be treated independently. Even complex situations involving the concurrent occurrence of any pair of the many possible mechanical and sensor failures can be handled simply yet appropriately. The formal verification activity was performed in parallel with the design activity, and the interaction between them resulted in a combined exercise in "design for validation"; formal verification proved valuable in identifying subtle residual bugs in early versions of our design that would otherwise have been difficult to detect.