How to capture dynamic behaviours of dependable systems

Authors:
Salvatore Distefano
Affiliations:
Department of Mathematics, Engineering Faculty, University of Messina, Messina, Italy
Venue:
International Journal of Parallel, Emergent and Distributed Systems - Papers from the Workshop on Dependable Parallel and Network-Centric Systems
Year:
2009


Abstract

In reliability terms, a unit, subsystem or system is considered dynamic if its failure probability varies. From the system point of view, reliability depends on the dynamics of the units, on the interdependencies arising among them (load sharing, standby redundancy, interference, etc.) and on their reliability relationships, which can themselves vary (as in phased-mission systems). These characteristics strongly influence the choice of technique for evaluating the reliability of a system. Combinatorial techniques can be adopted when the system's units are stochastically independent. Otherwise, it is necessary to resort to lower-level techniques and formalisms, such as state-space methods, hybrid (combinatorial/state-space) techniques or simulation. This paper analyses the reliability of fault-tolerant, dependent and dynamic systems. The approach is based on the concept of dependencies and their composition, which we investigate in depth at a high level of abstraction. Moreover, building on dynamic reliability block diagrams, a notation we developed by extending the well-known reliability block diagrams, we detail how this modelling approach captures dynamic reliability behaviours. To demonstrate the effectiveness of the technique, we focus the discussion on fault-tolerant, dynamic and dependent computing systems, investigating some dynamic reliability/availability behaviours and providing guidelines for their representation and evaluation. The discussion is supported by the results obtained by evaluating a complex fault-tolerant computing system with several units affected by common-cause events, shared workloads and other dynamic-dependable behaviours.
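
The abstract's distinction between combinatorial techniques (valid for stochastically independent units) and state-space methods (needed once dependencies such as load sharing appear) can be illustrated with a minimal Python sketch. It is not the dynamic reliability block diagram method of the paper: the function names, the exponential lifetime assumption and the two-unit load-sharing model are all hypothetical choices made for illustration only.

```python
import math

# Assumption: every unit has an exponentially distributed lifetime.
def unit_reliability(rate, t):
    """Probability that a unit with constant failure rate survives up to time t."""
    return math.exp(-rate * t)

# Combinatorial evaluation: only valid when the units fail independently.
def series(*reliabilities):
    """A series block works only if all of its units work."""
    return math.prod(reliabilities)

def parallel(*reliabilities):
    """A parallel block fails only if all of its units fail."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)

# State-space evaluation: two identical units sharing a load.
# While both are up, each fails at rate_shared; once one has failed,
# the survivor carries the whole load and fails at rate_alone.
# The Markov chain is: both up -> one up -> system failed.
def load_sharing_pair(rate_shared, rate_alone, t):
    """Reliability of the pair, i.e. P(both up) + P(exactly one up) at time t."""
    leave_both_up = 2.0 * rate_shared  # total rate of leaving the 'both up' state
    both_up = math.exp(-leave_both_up * t)
    if math.isclose(leave_both_up, rate_alone):
        # Degenerate case of equal rates (limit of the general formula).
        one_up = leave_both_up * t * math.exp(-leave_both_up * t)
    else:
        one_up = (leave_both_up / (rate_alone - leave_both_up)) * (
            math.exp(-leave_both_up * t) - math.exp(-rate_alone * t)
        )
    return both_up + one_up
```

When `rate_alone` equals `rate_shared`, the failure of one unit has no effect on the other, and `load_sharing_pair` reduces to the combinatorial `parallel` result. When the survivor's rate increases, the combinatorial formula overestimates reliability, which is why dependent behaviours call for state-space or hybrid techniques.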