An alarm management framework for automated network fault identification

  • Authors:
  • Chi-Shih Chao;An-Chi Liu

  • Affiliations:
  • Department of Information Engineering, Feng Chia University, 100 Wenhua Road, Seatwen, Taichung 407, Taiwan, ROC;Department of Information Engineering, Feng Chia University, 100 Wenhua Road, Seatwen, Taichung 407, Taiwan, ROC

  • Venue:
  • Computer Communications
  • Year:
  • 2004

Abstract

Many timing-constrained (real-time) distributed systems, such as real-time database systems, are now used in safety-critical applications. However, they are subject to system failures caused by malfunctioning components in the underlying network. Without the help of network experts or sophisticated management tools, most users cannot resolve these network problems by themselves. Worse, the use of such management tools, e.g. the 'ping' command, is often prohibited for security reasons. Accordingly, we develop a management system that automates network fault identification based solely on analysis of the abnormal events from the monitored timing-constrained distributed system. In this system, a fault identification framework automatically identifies faulty network elements using a two-level fault propagation model that combines Timing Constraint Petri nets with an alarm clustering mechanism. In addition, the concepts of redundant/ringleader alarms and innocent network elements are introduced into the framework to obtain an effective diagnosis. Finally, the management system is implemented according to the framework to demonstrate the performance of our fault identification.
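
To make the idea of alarm clustering and innocent network elements more concrete, the following is a minimal illustrative sketch in Python. It is not the authors' framework: the abstract does not define the clustering rule, the redundant/ringleader alarm concepts, or the Timing Constraint Petri net level, so none of those are reproduced here. The sketch assumes, hypothetically, that each alarm carries a timestamp and the network path of the transaction that missed its deadline, that alarms close together in time share a cause, and that any element on the path of a transaction that met its deadline can be treated as innocent. All names (`cluster_alarms`, `suspect_elements`, the `time` and `path` fields, the host and device names) are invented for illustration.

```python
def cluster_alarms(alarms, window):
    """Group alarms by time: an alarm joins the current cluster if it
    occurs within `window` seconds of the previous alarm (assumed rule)."""
    clusters = []
    for alarm in sorted(alarms, key=lambda a: a["time"]):
        if clusters and alarm["time"] - clusters[-1][-1]["time"] <= window:
            clusters[-1].append(alarm)
        else:
            clusters.append([alarm])
    return clusters


def suspect_elements(cluster, healthy_paths):
    """Return elements common to every alarmed path in the cluster,
    minus 'innocent' elements seen on paths that met their deadlines."""
    innocent = set().union(*healthy_paths)
    suspects = set(cluster[0]["path"])
    for alarm in cluster[1:]:
        suspects &= set(alarm["path"])
    return suspects - innocent


# Hypothetical observations: two near-simultaneous deadline misses and a later one.
alarms = [
    {"time": 0.0, "path": ["hostA", "sw1", "r1", "hostC"]},
    {"time": 0.4, "path": ["hostB", "sw1", "r1", "hostD"]},
    {"time": 30.0, "path": ["hostA", "sw2", "hostE"]},
]
# A transaction between hostA and hostB through sw1 met its deadline.
healthy_paths = [{"hostA", "sw1", "hostB"}]

clusters = cluster_alarms(alarms, window=1.0)
suspects = suspect_elements(clusters[0], healthy_paths)
print(len(clusters), suspects)
```

In this toy case the first two alarms fall into one cluster whose paths share `sw1` and `r1`; because `sw1` also lies on a healthy path, only `r1` remains as a suspect. A real diagnosis would also need the timing analysis that the paper attributes to its Petri net model.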