Model based approach for autonomic availability management

  • Authors: Kesari Mishra; Kishor S. Trivedi

  • Affiliations: Dept. of Electrical and Computer Engineering, Duke University, Durham, NC (both authors)

  • Venue: ISAS'06 Proceedings of the Third international conference on Service Availability
  • Year: 2006

Abstract

As increasingly complex computer systems have come to play a controlling role in all aspects of modern life, system availability and the associated downtime of technical systems have acquired critical importance. Losses due to system downtime have risen manifold and become wide-ranging. Even though the component-level availability of hardware and software has increased considerably, system-wide availability still needs improvement, because the heterogeneity of components and the complexity of their interconnections have also gone up considerably. As systems become more interconnected and diverse, architects are less able to anticipate and design for every interaction among components, leaving such issues to be dealt with at runtime. Therefore, in this paper, we propose an approach for autonomic management of system availability that provides real-time evaluation, monitoring, and management of the availability of systems in critical applications. A hybrid approach is used: analytic models provide the behavioral abstraction of components and subsystems, their interconnections, and their dependencies, while statistical inference is applied to the data from real-time monitoring of those components and subsystems to parameterize the system availability model. The model is solved online (that is, in real time), so that at any instant both the point and the interval estimates of the overall system availability are obtained by propagating the point and interval estimates of each input parameter through the system model. The online monitoring and estimation of system availability can then lead to adaptive online control of system availability.
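
To make the idea of propagating point and interval estimates through a system model concrete, the following is a minimal Python sketch, not the authors' implementation. Everything specific in it is an assumption made for illustration: the three-component system (a web server in series with two redundant databases), the sample monitoring data, the exponential distribution of times to failure and times to repair, and the use of Monte Carlo sampling to obtain the interval. Under the exponential assumption, the uncertainty in a rate estimated from n observations with total time T can be represented by a Gamma distribution with shape n and scale 1/T, which is what `sample_rate` draws from.

```python
import random

# Hypothetical monitoring data (hours): observed times to failure (ttf)
# and times to repair (ttr) for each component.
OBSERVATIONS = {
    "web": {"ttf": [500.0, 720.0, 610.0, 830.0], "ttr": [2.0, 1.5, 3.0, 2.5]},
    "db1": {"ttf": [300.0, 410.0, 280.0, 350.0], "ttr": [4.0, 5.0, 3.5, 4.5]},
    "db2": {"ttf": [320.0, 390.0, 300.0, 360.0], "ttr": [4.5, 4.0, 5.0, 3.5]},
}
COMPONENTS = ["web", "db1", "db2"]


def rate_point_estimate(durations):
    """Maximum-likelihood rate for exponential durations: events / total time."""
    return len(durations) / sum(durations)


def sample_rate(durations, rng):
    """Draw one plausible rate, assuming exponentially distributed durations.

    With n observations summing to T, the rate uncertainty is modeled as
    Gamma(shape=n, scale=1/T).
    """
    return rng.gammavariate(len(durations), 1.0 / sum(durations))


def component_availability(failure_rate, repair_rate):
    """Steady-state availability of a two-state (up/down) component."""
    return repair_rate / (failure_rate + repair_rate)


def system_availability(a_web, a_db1, a_db2):
    """Assumed system model: web server in series with two parallel databases."""
    return a_web * (1.0 - (1.0 - a_db1) * (1.0 - a_db2))


def estimate(observations, n_samples=5000, confidence=0.95, seed=0):
    """Return (point_estimate, (lower, upper)) for system availability."""
    rng = random.Random(seed)

    # Point estimate: plug the point estimates of the rates into the model.
    point = system_availability(*[
        component_availability(
            rate_point_estimate(observations[name]["ttf"]),
            rate_point_estimate(observations[name]["ttr"]),
        )
        for name in COMPONENTS
    ])

    # Interval estimate: sample the input rates, push each sample through
    # the same model, and read off empirical percentiles of the output.
    draws = sorted(
        system_availability(*[
            component_availability(
                sample_rate(observations[name]["ttf"], rng),
                sample_rate(observations[name]["ttr"], rng),
            )
            for name in COMPONENTS
        ])
        for _ in range(n_samples)
    )
    lower = draws[int((1.0 - confidence) / 2.0 * n_samples)]
    upper = draws[int((1.0 + confidence) / 2.0 * n_samples) - 1]
    return point, (lower, upper)


if __name__ == "__main__":
    point, (lower, upper) = estimate(OBSERVATIONS)
    print(f"availability ~ {point:.6f}, 95% interval [{lower:.6f}, {upper:.6f}]")
```

In an online setting, `estimate` would be rerun as new failure and repair observations arrive, so the interval narrows as monitoring data accumulates. A controller could then act when the lower bound falls below a required availability target, although the choice of control policy is outside this sketch.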