Automatic alarm correlation for fault identification

  • Authors:
  • I. Rouvellou;G. W. Hart

  • Venue:
  • INFOCOM '95 Proceedings of the Fourteenth Annual Joint Conference of the IEEE Computer and Communication Societies (Vol. 2)
  • Year:
  • 1995

Abstract

In communication networks, a large number of alarms exist to signal abnormal network behavior. Because a network fault typically results in a number of alarms, correlating these alarms and identifying their common source is a major problem in fault management, and one of considerable practical significance. Uncorrelated alarms can lead to significant misdirected effort based on insufficient information, and can trigger multiple, possibly contradictory, corrective actions as each alarm is handled independently. The paper proposes a general framework for solving the alarm correlation problem. The authors introduce a new model for faults and alarms based on probabilistic finite state machines and propose two algorithms. The first acquires the fault models from possibly incomplete and incorrect data. The second correlates alarms in the presence of multiple faults and noisy information. Both algorithms have polynomial time complexity, use an extension of the Viterbi algorithm to deal with corrupted data, and can be implemented in hardware. As an example, they are applied to analyse faults using data generated by the ANS (Advanced Network and Services, Inc.)/NSF T3 network.
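
The abstract does not spell out the authors' extension of the Viterbi algorithm, so the following is only a minimal sketch of the standard Viterbi algorithm on a small probabilistic state machine, to illustrate the kind of decoding involved. The two-state model ("ok", "fault"), the observation symbols ("none", "alarm"), and every probability below are assumptions invented for illustration and are not taken from the paper.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Standard Viterbi decoding: return (most likely state path, its log-probability)."""

    def log(p):
        # Work in log space; a zero probability maps to minus infinity.
        return math.log(p) if p > 0 else float("-inf")

    # best[s] = log-probability of the best path that ends in state s
    best = {s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}
    back = []  # one dict per later time step: state -> best predecessor

    for obs in observations[1:]:
        prev, best, pointers = best, {}, {}
        for s in states:
            # Pick the predecessor that maximises the path probability into s.
            r = max(states, key=lambda q: prev[q] + log(trans_p[q][s]))
            best[s] = prev[r] + log(trans_p[r][s]) + log(emit_p[s][obs])
            pointers[s] = r
        back.append(pointers)

    # Trace the best path backwards from the most likely final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


# Hypothetical toy model (all values are illustrative assumptions).
STATES = ["ok", "fault"]
START_P = {"ok": 0.9, "fault": 0.1}
TRANS_P = {
    "ok": {"ok": 0.9, "fault": 0.1},
    "fault": {"ok": 0.2, "fault": 0.8},
}
# Noisy observations: a healthy network sometimes raises a spurious alarm,
# and a faulty one sometimes stays silent.
EMIT_P = {
    "ok": {"none": 0.9, "alarm": 0.1},
    "fault": {"none": 0.2, "alarm": 0.8},
}

if __name__ == "__main__":
    alarms = ["none", "none", "alarm", "alarm", "alarm"]
    path, log_p = viterbi(alarms, STATES, START_P, TRANS_P, EMIT_P)
    print(path, math.exp(log_p))
```

With this toy model, a sustained run of alarms is decoded as a transition into the "fault" state, while a single isolated alarm surrounded by silence is decoded as noise in the "ok" state. The running time is proportional to the sequence length times the square of the number of states, which is consistent with the polynomial complexity claimed in the abstract, although the paper's multi-fault correlation algorithm is more involved than this sketch.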