(Awarded Best Theory Paper!) A Probabilistic Approach to Estimating Computer System Reliability

  • Authors: Robert Apthorpe
  • Affiliations: Excite@Home, Inc.
  • Venue: LISA '01 Proceedings of the 15th USENIX conference on System administration
  • Year: 2001

Abstract

Probabilistic Risk Assessment (PRA) is a method of estimating system reliability by combining logic models of the ways systems can fail with numerical failure rates. One postulates a failure state and systematically decomposes this state into a combination of more basic events through a process known as Fault Tree Analysis (FTA). Failure rates are derived from vendor specifications, historical trends, on-call reports, and many other sources. FTA has been used for decades in the defense, aerospace, and nuclear power industries to manage risk and increase reliability of complex engineering systems. Combining FTA with event tree analysis (ETA), one can associate failure probabilities with consequences to clearly communicate risk both pictorially and numerically. Basic PRA techniques can help increase the reliability and security of computer systems.
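
To make the combination of logic models and numerical failure rates concrete, here is a minimal sketch of how a small fault tree might be evaluated and its result carried into an event tree. It is not taken from the paper. The class and function names (`BasicEvent`, `Gate`, `failure_probability`, `outcome_probabilities`), the example system, and every probability are hypothetical. The sketch assumes that basic events are statistically independent and that no basic event appears more than once in the tree. When events are shared or correlated, these simple gate formulas no longer hold.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence, Union


@dataclass(frozen=True)
class BasicEvent:
    """A leaf of the fault tree: a component failure with a known probability."""
    name: str
    probability: float  # probability of failure over the period of interest


@dataclass(frozen=True)
class Gate:
    """An intermediate event defined by a logic gate over its inputs."""
    kind: str  # "AND" or "OR"
    inputs: Sequence["Node"]


Node = Union[BasicEvent, Gate]


def failure_probability(node: Node) -> float:
    """Evaluate the tree bottom-up, assuming independent, non-repeated events."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [failure_probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must fail: multiply the probabilities.
        return prod(probs)
    if node.kind == "OR":
        # At least one input fails: complement of "none fail".
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate kind: {node.kind!r}")


def outcome_probabilities(initiator: float, branches: dict[str, float]) -> dict[str, float]:
    """Event-tree step: split an initiating-event probability across
    mutually exclusive consequences whose branch probabilities sum to 1."""
    if abs(sum(branches.values()) - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return {name: initiator * p for name, p in branches.items()}


# Hypothetical system: the server is down if both power supplies fail
# or if its single disk fails. All numbers are made up for illustration.
top = Gate("OR", [
    Gate("AND", [BasicEvent("psu_a", 0.01), BasicEvent("psu_b", 0.01)]),
    BasicEvent("disk", 0.02),
])

p_down = failure_probability(top)
# Hypothetical consequences, conditional on the server going down.
outcomes = outcome_probabilities(p_down, {"brief outage": 0.9, "data loss": 0.1})
print(p_down, outcomes)
```

The redundant power supplies sit under an AND gate and contribute very little to the top event, while the single disk under the OR gate dominates it. The fault tree decomposition is meant to expose this kind of single point of failure.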