Availability of enterprise IT systems: an expert-based Bayesian framework

  • Authors:
  • Ulrik Franke; Pontus Johnson; Johan König; Liv Marcks Von Würtemberg

  • Affiliations:
  • Industrial Information and Control Systems, Royal Institute of Technology, Stockholm, Sweden (all authors)

  • Venue:
  • Software Quality Control
  • Year:
  • 2012

Abstract

Ensuring the availability of enterprise IT systems is a challenging task. The factors that can bring systems down are numerous, and their impact on various system architectures is difficult to predict. At the same time, maintaining high availability is crucial in many applications, ranging from control systems in the electric power grid and electronic trading systems on the stock market to specialized command and control systems for military and civilian purposes. This paper describes a Bayesian decision support model designed to help enterprise IT system decision-makers evaluate the consequences of their decisions by analyzing various scenarios. The model is based on expert elicitation from 50 experts on IT systems availability, obtained through an electronic survey. The Bayesian model uses a leaky Noisy-OR method to combine the expert opinions on 16 factors affecting systems availability. Using this model, the effect of changes to a system can be estimated beforehand, providing decision support for improving enterprise IT systems availability. The Bayesian model thus obtained is then integrated into a standard reliability block diagram-style mathematical model for assessing availability at the architecture level. In this model, the IT systems play the role of building blocks. The overall assessment framework thus addresses measures to ensure high availability both at the level of individual systems and at the level of the entire enterprise architecture. Examples are presented to illustrate how the framework can be used by practitioners aiming to ensure high availability.
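
The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the textbook form of the leaky Noisy-OR gate, in which each active cause independently triggers the effect with its own link probability and a leak term covers unmodeled causes, together with the usual series and parallel rules for reliability block diagrams. The function names and every numeric value below are hypothetical and are not taken from the paper's survey data or its exact parameterization.

```python
from math import prod


def leaky_noisy_or(link_probs, active, leak):
    """Probability of the effect under a textbook leaky Noisy-OR gate.

    link_probs[i] is the assumed probability that cause i alone produces
    the effect, active[i] says whether cause i is present, and leak is the
    probability of the effect when no modeled cause is present.
    """
    p_no_effect = 1.0 - leak
    for p, is_active in zip(link_probs, active):
        if is_active:
            p_no_effect *= 1.0 - p
    return 1.0 - p_no_effect


def series_availability(availabilities):
    """All blocks must be up: availabilities multiply."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """At least one block must be up: unavailabilities multiply."""
    return 1.0 - prod(1.0 - a for a in availabilities)


# Hypothetical example: two causal factors, only the first one present.
p_down = leaky_noisy_or([0.2, 0.5], [True, False], leak=0.1)

# Hypothetical architecture: one system in series with a redundant pair.
a_total = series_availability([0.99, parallel_availability([0.9, 0.9])])
```

In this sketch the Noisy-OR output plays the role of a per-system estimate, and the block-diagram functions combine such per-system figures into an architecture-level figure, mirroring the two levels the abstract describes.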