Using Time over Threshold to Reduce Noise in Performance and Fault Management Systems

  • Authors:
  • Mark W. Sylor;Lingmin Meng

  • Venue:
  • DSOM '00 Proceedings of the 11th IFIP/IEEE International Workshop on Distributed Systems: Operations and Management: Services Management in Intelligent Networks
  • Year:
  • 2000

Abstract

Fault management systems detect performance problems and intermittent failures by periodically examining a metric (such as the utilization of a link) and raising an alarm if the value is above a threshold. Such systems can generate numerous alarms. Various schemes have been proposed for reducing the number of alarms or for singling out the important ones. The time over threshold detection algorithm reduces the volume of alarms at the source, in the detector itself. This paper describes an experiment that compares time over threshold against simple threshold crossings. The experiment demonstrates that time over threshold reduces the number of alarms raised by a factor of 25 without any significant reduction in the problems detected.
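
The abstract does not spell out the detection rule, so the following Python sketch is only one plausible reading, not the authors' algorithm. It assumes that a simple threshold detector raises an alarm on every poll whose value is above the threshold, and that a time over threshold detector raises a single alarm once at least `min_over` of the last `window` polls are above the threshold. The function names and the `window` and `min_over` parameters are hypothetical.

```python
from collections import deque


def simple_threshold_alarms(samples, threshold):
    """Return the poll indices at which a simple detector alarms.

    Assumption: one alarm per poll whose value is above the threshold.
    """
    return [i for i, value in enumerate(samples) if value > threshold]


def time_over_threshold_alarms(samples, threshold, window, min_over):
    """Return the poll indices at which a time over threshold detector alarms.

    Assumption (not taken from the paper): an alarm is raised when at least
    `min_over` of the most recent `window` polls are above the threshold,
    and only once per episode, until the condition clears again.
    """
    recent = deque(maxlen=window)  # sliding window of over/under flags
    alarms = []
    active = False  # True while the alarm condition currently holds
    for i, value in enumerate(samples):
        recent.append(value > threshold)
        condition = sum(recent) >= min_over
        if condition and not active:
            alarms.append(i)  # alarm only on the transition into the condition
        active = condition
    return alarms


# Illustrative link utilization samples (percent), made up for this sketch.
utilization = [10, 95, 20, 30, 92, 94, 96, 97, 40, 30, 20, 10]

simple = simple_threshold_alarms(utilization, threshold=90)
tot = time_over_threshold_alarms(utilization, threshold=90, window=4, min_over=3)
print(len(simple), "alarms from simple threshold checks")
print(len(tot), "alarms from time over threshold")
```

On this made-up series the isolated spike at index 1 does not trigger the time over threshold detector, while the sustained run from index 4 to 7 produces a single alarm. The actual reduction in any deployment depends on the traffic and on the chosen window, so the numbers here say nothing about the factor reported in the paper.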