A comparative analysis of the reliability of simple and two-level checkpointing techniques in two different distributed industrial control system architectures

  • Authors:
  • Alicia Rubio; Rafael Ors

  • Affiliations:
  • Fault Tolerant Computing Group (GSTF), Departamento de Informática de Sistemas y Computadores (DISCA), Polytechnic University of Valencia (UPV), Camino de Vera s/n., 46022 Valencia, Spain; Fault Tolerant Computing Group (GSTF), Departamento de Informática de Sistemas y Computadores (DISCA), Polytechnic University of Valencia (UPV), Camino de Vera s/n., 46022 Valencia, Spain

  • Venue:
  • Systems Analysis Modelling Simulation
  • Year:
  • 2003

Abstract

In distributed industrial control systems, a certain level of reliability must be guaranteed. Checkpointing and rollback techniques offer an attractive way to achieve fault tolerance without an appreciable increase in cost or complexity. Several checkpointing techniques have been proposed, and most of them assume that stable storage is present in the system. Distributed industrial control systems, however, usually lack this kind of storage, so another storage strategy has to be employed. If checkpoints are stored only locally (Simple Checkpointing), the system tolerates only transient faults. If checkpoints are stored locally, at the same node, and additionally at one or more other nodes of the system (Two-level Checkpointing), the system can also recover from some permanent faults. This article presents the results of a study of the reliability of these two checkpoint storage strategies, in order to evaluate whether the reliability increase of the Two-level method justifies its greater complexity. To carry out this study, two distributed industrial control systems are presented. Each of them is based on a different node architecture, which has an important effect on the results of the study.
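
The difference between the two storage strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the function names, and the idea of restoring onto a spare node are assumptions made here purely for illustration, and the abstract does not say how checkpoints are transferred or which nodes hold the remote copies.

```python
import copy


class Node:
    """Hypothetical control node (illustrative only, not from the article)."""

    def __init__(self, name):
        self.name = name
        self.state = {}               # volatile working state
        self.local_checkpoint = None  # copy kept on this node
        self.remote_checkpoints = {}  # copies held on behalf of other nodes
        self.alive = True             # False after a permanent fault


def simple_checkpoint(node):
    """Simple Checkpointing: store the checkpoint on the same node only."""
    node.local_checkpoint = copy.deepcopy(node.state)


def two_level_checkpoint(node, backups):
    """Two-level Checkpointing: local copy plus copies on other nodes."""
    simple_checkpoint(node)
    for backup in backups:
        if backup.alive:
            backup.remote_checkpoints[node.name] = copy.deepcopy(node.state)


def recover_transient(node):
    """Roll back to the local checkpoint after a transient fault."""
    if node.local_checkpoint is None:
        return False
    node.state = copy.deepcopy(node.local_checkpoint)
    return True


def recover_permanent(failed_name, spare, backups):
    """Restore a permanently failed node's state onto a spare node.

    Succeeds only if some surviving node holds a remote copy, which is
    the case for Two-level Checkpointing but not for Simple Checkpointing.
    """
    for backup in backups:
        if backup.alive and failed_name in backup.remote_checkpoints:
            spare.state = copy.deepcopy(backup.remote_checkpoints[failed_name])
            return True
    return False
```

In this sketch a node that only calls `simple_checkpoint` can roll back after a transient fault, but `recover_permanent` returns `False` for it, because no other node holds a copy of its state. The extra copying in `two_level_checkpoint` stands in for the added complexity whose benefit the study sets out to evaluate.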