Availability Requirement for a Fault Management Server in High-Availability Communication Systems

  • Authors:
  • Hairong Sun;James J. Han;Isaac Levendel

  • Affiliations:
  • -;-;-

  • Venue:
  • IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we investigate the availability requirement for the fault management server in high-availability communication systems. According to our study, we find that the availability of the fault management server does not need to be 99.999% in order to guarantee a 99.999% system availability as long as the fail-safe ratio (the probability that the failure of the fault management server will not bring the system down) and the fault coverage ratio (the probability that the failure in the system can be detected and recovered by the fault management server) are sufficiently high. Tradeoffs can be made among the availability of the fault management server, the fail-safe ratio and the fault coverage ratio to optimize system availability. A cost-effective design for the fault management server is proposed in this paper.