How to capture dynamic behaviours of dependable systems

  • Authors:
  • Salvatore Distefano

  • Affiliations:
  • Department of Mathematics, Engineering Faculty, University of Messina, Messina, Italy

  • Venue:
  • International Journal of Parallel, Emergent and Distributed Systems - Papers from the Workshop on Dependable Parallel and Network-Centric Systems
  • Year:
  • 2009

Abstract

In terms of reliability, a unit, subsystem or system is considered dynamic if its failure probability is variable. From the system point of view, reliability depends on the units' dynamics, on the inter-dependencies arising among those units (load sharing, standby redundancy, interference, etc.) and on their reliability relationships, which can also vary (as in phased-mission systems). These characteristics strongly affect the choice of technique used to evaluate the reliability of a system. Combinatorial techniques can be adopted when the system's units are stochastically independent. Otherwise, lower-level techniques and formalisms are required, such as state space methods, hybrid (combinatorial/state space) techniques or simulation. This paper analyses the reliability of fault tolerant/dependent/dynamic systems. The approach is based on the concept of dependencies and their composition. In the paper, we investigate these concepts in depth from a high level of abstraction. Moreover, using dynamic reliability block diagrams, a notation we developed by extending the well-known reliability block diagrams, we detail how this modelling approach captures dynamic reliability behaviours. To demonstrate the effectiveness of the technique, we focus the discussion mainly on fault tolerant-dynamic-dependent computing systems, investigating some dynamic reliability/availability behaviours and providing guidelines for their representation and evaluation. The discussion is supported by the results obtained by evaluating a complex fault tolerant computing system with several units affected by common cause events, shared workloads and other dynamic-dependable behaviours.
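As an illustration of the combinatorial case mentioned in the abstract, the following minimal Python sketch evaluates a classical (static) reliability block diagram under the assumption that all units are stochastically independent. It is not the dynamic reliability block diagram method of the paper. The function names, the exponential lifetime model, the failure rates and the example system (two redundant CPUs in series with one disk) are all assumptions chosen for illustration.

```python
import math


def exp_reliability(rate, t):
    """Reliability at time t of a unit with constant failure rate (assumed model)."""
    return math.exp(-rate * t)


def series(reliabilities):
    """Series block: every unit must work, so reliabilities multiply."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel(reliabilities):
    """Parallel block: the block fails only if every unit fails."""
    unreliability = 1.0
    for r in reliabilities:
        unreliability *= (1.0 - r)
    return 1.0 - unreliability


def system_reliability(t):
    """Hypothetical system: two redundant CPUs in series with one disk."""
    cpu = exp_reliability(1e-3, t)   # assumed failure rate, per hour
    disk = exp_reliability(5e-4, t)  # assumed failure rate, per hour
    return series([parallel([cpu, cpu]), disk])


if __name__ == "__main__":
    for hours in (0, 100, 1000):
        print(hours, system_reliability(hours))
```

These product formulas hold only under independence. Once units share a load, act as standby spares or are hit by common cause events, the products no longer apply, which is why the abstract points to state space methods, hybrid techniques or simulation for such cases.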