Analysis and Modeling of Correlated Failures in Multicomputer Systems

  • Authors:
  • Dong Tang;Ravishankar K. Iyer

  • Venue:
  • IEEE Transactions on Computers - Special issue on fault-tolerant computing
  • Year:
  • 1992

Abstract

Based on measurements from two DEC VAX-cluster multicomputer systems, the issue of correlated failures is addressed. In particular, the characteristics of correlated failures, their impact on dependability, and their modeling are discussed. The data show that most correlated failures are related to errors in shared resources and propagate from one machine to another. Comparisons between measurement-based models and analytical models that assume failure independence show that the impact of correlated failures on dependability is significant. Two validated models, the c-dependent model and the p-dependent model, are developed to evaluate the dependability of systems with correlated failures.