Topology-Aware Correlated Network Anomaly Event Detection and Diagnosis

  • Authors:
  • Prasad Calyam; Manojprasadh Dhanapalan; Mukundan Sridharan; Ashok Krishnamurthy; Rajiv Ramnath

  • Affiliations:
  • University of Missouri-Columbia, Columbia, USA; The Ohio State University, Columbus, USA; The Samraksh Company, Dublin, USA; RENCI, Chapel Hill, USA; The Ohio State University, Columbus, USA

  • Venue:
  • Journal of Network and Systems Management
  • Year:
  • 2014

Abstract

For purposes such as end-to-end monitoring, capacity planning, and performance bottleneck troubleshooting across multi-domain networks, there is an increasing trend to deploy interoperable measurement frameworks such as perfSONAR. These deployments expose vast data archives of current and historic measurements, which can be queried using web services. Analyzing these measurements with effective schemes to detect and diagnose anomaly events is vital, since it allows for verifying whether network behavior meets expectations. In addition, it allows for proactive notification of bottlenecks that may be affecting a large number of users. In this paper, we describe our novel topology-aware scheme that can be integrated into perfSONAR deployments for detection and diagnosis of network-wide correlated anomaly events. Our scheme applies spatial and temporal analyses to combined topology and uncorrelated anomaly event information in order to detect correlated anomaly events. Subsequently, a set of "filters" is applied to the detected events to prioritize them based on potential severity, and to drill down into each event's "nature" (e.g., event burstiness) and "root-location(s)" (e.g., edge or core location affinity). To validate our scheme, we use traceroute information and one-way delay measurements collected over 3 months between the various U.S. Department of Energy national lab network locations, published via perfSONAR web services. Further, using real-world case studies, we show how our scheme can provide helpful insights for detection, visualization, and diagnosis of correlated network anomaly events, and can ultimately save time, effort, and costs spent on network management.
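The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what combining temporal and spatial analysis might look like, not the authors' method. It assumes that each monitored path is known as a list of traceroute hops and that per-path (uncorrelated) anomaly events carry a timestamp. The path names, hop names, the `correlate` function, and the 300-second window are all hypothetical choices made for the example.

```python
from itertools import combinations

# Hypothetical topology: each path maps to its ordered traceroute hops.
paths = {
    "A->B": ["a-edge", "core1", "core2", "b-edge"],
    "A->C": ["a-edge", "core1", "core3", "c-edge"],
    "D->B": ["d-edge", "core4", "core2", "b-edge"],
}

# Hypothetical uncorrelated anomaly events: (path id, timestamp in seconds).
events = [("A->B", 100), ("A->C", 130), ("D->B", 5000)]


def correlate(paths, events, window=300):
    """Pair events that are close in time (temporal check) and whose
    paths share at least one hop (spatial check)."""
    groups = []
    for (p1, t1), (p2, t2) in combinations(events, 2):
        # Temporal analysis: skip pairs further apart than the window.
        if abs(t1 - t2) > window:
            continue
        # Spatial analysis: hops common to both paths are candidate root locations.
        shared = set(paths[p1]) & set(paths[p2])
        if shared:
            groups.append({"paths": (p1, p2), "shared_hops": sorted(shared)})
    return groups


correlated = correlate(paths, events)
print(correlated)
```

In this toy input, only the events on `A->B` and `A->C` fall within the window, and the hops they share (`a-edge`, `core1`) are the candidate locations a later filtering step could examine for edge or core affinity.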