Fault-tolerant fault tolerance for component-based automation systems

  • Authors:
  • Manuel Oriol;Thomas Gamer;Thijmen de Gooijer;Michael Wahler;Ettore Ferranti

  • Affiliations:
  • ABB Schweiz AG Corporate Research, Industrial Software Systems, Baden, Switzerland;ABB AG Corporate Research Center Germany, Industrial Software Systems, Ladenburg, Germany;ABB AB Corporate Research, Industrial Software Systems, Västerås, Sweden;ABB Schweiz AG Corporate Research, Industrial Software Systems, Baden, Switzerland;ABB Schweiz AG Corporate Research, Industrial Software Systems, Baden, Switzerland

  • Venue:
  • Proceedings of the 4th international ACM Sigsoft symposium on Architecting critical systems
  • Year:
  • 2013

Abstract

To guarantee high availability, automation systems must be fault-tolerant. To this end, they must provide redundant solutions for the critical parts of the system. Classical fault tolerance patterns such as standby or N-modular redundancy provide system stability in the case of a fault. Fault tolerance is subsequently degraded or, depending on the number of deployed replicas, often even unavailable until the system has been repaired. We introduce a combination of a component-based framework, redundancy patterns, and a runtime manager, which is able to provide fault tolerance, to detect host failures, and to trigger a reconfiguration of the system at runtime. This combined solution maintains system operation in case a fault occurs and automatically restores fault tolerance. The proposed solution is validated using a case study of an industrial distributed automation system. The validation shows how our solution quickly restores fault tolerance without the need for operator intervention or immediate hardware replacement while limiting the impact on other applications.
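
The reconfiguration idea in the abstract (detect a host failure, then re-deploy lost replicas so that redundancy is restored without operator intervention) can be illustrated with a minimal sketch. This is not the authors' framework or runtime manager: the names `Host`, `Component`, `replicas_wanted`, and `restore_redundancy` are hypothetical, failure detection is reduced to an `alive` flag, and placement simply picks the first spare live host.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    alive: bool = True  # assumption: set to False by some failure detector


@dataclass
class Component:
    name: str
    replicas_wanted: int  # desired redundancy level (hypothetical parameter)
    placement: list = field(default_factory=list)  # names of hosts running a replica


def restore_redundancy(components, hosts):
    """Drop replicas on failed hosts, then re-deploy onto spare live hosts.

    Returns the list of (component name, host name) deployments performed.
    """
    live = [h.name for h in hosts if h.alive]
    actions = []
    for comp in components:
        # Keep only replicas whose host is still alive.
        comp.placement = [h for h in comp.placement if h in live]
        # Candidate hosts: alive and not already running this component.
        spares = [h for h in live if h not in comp.placement]
        # Re-deploy until the desired redundancy is reached or spares run out.
        while len(comp.placement) < comp.replicas_wanted and spares:
            target = spares.pop(0)
            comp.placement.append(target)
            actions.append((comp.name, target))
    return actions


hosts = [Host("h1"), Host("h2"), Host("h3")]
controller = Component("controller", replicas_wanted=2, placement=["h1", "h2"])

hosts[0].alive = False  # simulate a failure of host h1
actions = restore_redundancy([controller], hosts)
```

In this sketch, after `h1` fails the controller keeps running on `h2` and a second replica is placed on `h3`, so the redundancy level is back to two. If no spare host is left, the component continues in a degraded state, which corresponds to the classical behaviour the abstract contrasts against.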