NetworkMD: topology inference and failure diagnosis in the last mile

  • Authors:
  • Yun Mao;Hani Jamjoom;Shu Tao;Jonathan M. Smith

  • Affiliations:
  • University of Pennsylvania, Philadelphia, PA;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;University of Pennsylvania, Philadelphia, PA

  • Venue:
  • Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
  • Year:
  • 2007

Abstract

Health monitoring and automated failure localization and diagnosis have become critical to service providers of large distribution networks (e.g., digital cable and fiber-to-the-home) because of the growing scale and complexity of the services they offer. Existing automated failure diagnosis solutions typically assume complete knowledge of network topology, which in practice is rarely available. The solution presented in this paper, Network Management and Diagnosis (NetworkMD), is an automated failure diagnosis system that infers failure groups from historical failure data and, optionally, geographical information. The inferred failure groups mirror the missing topology and can be used to localize failures, diagnose root causes of problems, and detect misconfigurations in known topologies. NetworkMD uses an unsupervised learning algorithm based on non-negative matrix factorization (NMF) to infer failure groups. Using cable networks as the primary example, we demonstrate the effectiveness of NetworkMD in both simulated settings and a real environment, using data collected from a commercial network serving hundreds of thousands of customers via thousands of intermediate network devices.
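
To illustrate the idea of inferring failure groups with NMF, the following is a minimal sketch. It is not the NetworkMD implementation. It assumes a small binary matrix whose rows are end devices and whose columns are failure episodes, and it uses the standard multiplicative-update rule for NMF, which may differ from the variant used in the paper. The toy data, the function names (`nmf`, `infer_failure_groups`), and the rule of assigning each device to its strongest factor are all assumptions made for illustration. The code is written in plain Python to stay dependency-free.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Approximate v (devices x episodes) as w @ h with non-negative factors.

    Uses the standard multiplicative updates for the squared-error objective.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    h = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def infer_failure_groups(v, rank):
    """Assign each device (row of v) to the factor it loads on most strongly."""
    w, _ = nmf(v, rank)
    # Rescale each column of w so that factors are comparable across groups.
    col_max = [max(row[k] for row in w) or 1.0 for k in range(rank)]
    return [max(range(rank), key=lambda k: w[i][k] / col_max[k])
            for i in range(len(w))]


# Hypothetical history: 6 devices (rows) observed over 5 failure episodes
# (columns). Devices 0-2 always fail together, as do devices 3-5.
V = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

groups = infer_failure_groups(V, rank=2)
print(groups)
```

In this toy case, devices that repeatedly fail together end up with the same label, which hints at a shared upstream component even though no topology was supplied. The label values themselves are arbitrary, since NMF factors are defined only up to permutation and scaling. Real failure data would be noisy, and the choice of `rank` would need to be made separately.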