Impact of Fault Management Server and Its Failure-related Parameters on High-Availability Communication Systems

  • Authors:
  • Hairong Sun; James J. Han; Isaac Levendel

  • Venue:
  • DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
  • Year:
  • 2002

Abstract

In this paper, we investigate the impact of a fault management server and its failure-related parameters on high-availability communication systems. The key point is that, to achieve high overall availability of a communication system, the availability of the fault management server itself is less important than its fail-safe ratio and fault coverage. In other words, when building fault management servers, more attention should be paid to improving the server's ability to detect faults in the functional units and to isolate itself from the functional units when it fails. Tradeoffs can be made among the availability of the fault management server, the fail-safe ratio, and the fault coverage ratio to optimize system availability. A cost-effective design for the fault management server is proposed in this paper.
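
The tradeoff named in the abstract can be illustrated with a toy first-order calculation. This is a sketch only and is not the model used in the paper: the formula, the function name `system_unavailability`, the recovery times, and every numeric value below are assumptions chosen for illustration. The sketch assumes that a functional-unit fault is recovered quickly only if the fault management server is up and detects it, and that a non-fail-safe server failure takes the functional unit down for a slow recovery.

```python
# Toy first-order model (an assumption for illustration, not the paper's model).
# Unavailability is approximated as: failure rate x expected downtime per failure,
# which is reasonable only when the resulting unavailability is small.

def system_unavailability(fu_failure_rate, fms_failure_rate, fms_availability,
                          coverage, fail_safe_ratio,
                          fast_recovery_h=0.01, slow_recovery_h=4.0):
    """Approximate steady-state unavailability of one functional unit (FU).

    Rates are per hour; recovery times are in hours. All parameters are
    hypothetical inputs, not values taken from the paper.
    """
    # A fault is handled quickly only if the server is up AND detects it.
    effective_coverage = fms_availability * coverage
    downtime_per_fu_fault = (effective_coverage * fast_recovery_h
                             + (1.0 - effective_coverage) * slow_recovery_h)
    fu_term = fu_failure_rate * downtime_per_fu_fault
    # Server failures that are not fail-safe also take the FU down.
    fms_term = fms_failure_rate * (1.0 - fail_safe_ratio) * slow_recovery_h
    return fu_term + fms_term


# Hypothetical baseline parameters.
BASELINE = dict(fu_failure_rate=1e-3, fms_failure_rate=1e-4,
                fms_availability=0.999, coverage=0.95, fail_safe_ratio=0.90)

# Improve one server parameter at a time and compare.
SCENARIOS = {
    "baseline": {},
    "better server availability": {"fms_availability": 0.9999},
    "better fault coverage": {"coverage": 0.99},
    "better fail-safe ratio": {"fail_safe_ratio": 0.99},
}

for name, change in SCENARIOS.items():
    u = system_unavailability(**{**BASELINE, **change})
    print(f"{name:28s} unavailability ~ {u:.3e}")
```

Under these assumed values, raising fault coverage or the fail-safe ratio lowers the estimated unavailability more than raising the server's own availability does, which is the direction of the abstract's claim. Different parameter choices can change the size of the gap.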