Human-guided machine learning for fast and accurate network alarm triage

  • Authors:
  • Saleema Amershi;Bongshin Lee;Ashish Kapoor;Ratul Mahajan;Blaine Christian

  • Affiliations:
  • Microsoft Research, Redmond, WA and Computer Science & Engineering, DUB, University of Washington, Seattle, WA;Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;Microsoft Corporation, Redmond, WA

  • Venue:
  • IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume Three
  • Year:
  • 2011

Abstract

Network alarm triage refers to grouping and prioritizing a stream of low-level device health information to help operators find and fix problems. Today, this process tends to be largely manual because existing rule-based tools cannot easily evolve with the network. We present CueT, a system that uses interactive machine learning to constantly learn from the triaging decisions of operators. It then uses that learning in novel visualizations to help them quickly and accurately triage alarms. Unlike prior interactive machine learning systems, CueT handles a highly dynamic environment where the groups of interest are not known a priori and evolve constantly. Our evaluations with real operators and data from a large network show that CueT significantly improves the speed and accuracy of alarm triage.
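The abstract does not spell out CueT's learning algorithm, so the following is only a minimal sketch of the general idea: recommend existing groups for an incoming alarm, suggest a new group when nothing is close enough, and learn from every operator decision. The class name `TriageRecommender`, the token-set Jaccard distance, and the `new_group_threshold` value are all assumptions made for illustration, not details taken from the paper.

```python
from dataclasses import dataclass, field


def jaccard_distance(a: set, b: set) -> float:
    """Distance in [0, 1] between two token sets (0 means identical)."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


@dataclass
class TriageRecommender:
    """Hypothetical online recommender; NOT CueT's published algorithm."""

    # Assumed cutoff: beyond this distance no existing group is suggested.
    new_group_threshold: float = 0.6
    # Operator-labeled history as (token_set, group_id) pairs.
    labeled: list = field(default_factory=list)

    def recommend(self, alarm: str) -> list:
        """Return (group_id, distance) pairs, nearest first.

        An empty list means no known group is close enough, so the
        operator would be prompted to start a new group.
        """
        tokens = set(alarm.lower().split())
        best = {}
        for other, group in self.labeled:
            d = jaccard_distance(tokens, other)
            # Keep the distance to the closest labeled alarm in each group.
            if d < best.get(group, 1.1):
                best[group] = d
        ranked = sorted(best.items(), key=lambda kv: kv[1])
        return [(g, d) for g, d in ranked if d <= self.new_group_threshold]

    def record_decision(self, alarm: str, group_id: str) -> None:
        """Learn from the operator's final choice, which may be a new group."""
        self.labeled.append((set(alarm.lower().split()), group_id))
```

In use, each alarm would first be passed to `recommend`, and the operator's final choice, whether or not it matches the suggestion, would be fed back through `record_decision`. That feedback step is what lets the set of groups grow and change over time instead of being fixed in advance.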