Anomaly detection and diagnosis in grid environments

  • Authors: Lingyun Yang; Chuang Liu; Jennifer M. Schopf; Ian Foster

  • Affiliations: University of Chicago, Chicago, IL; Microsoft, Redmond, WA; Argonne National Laboratory, Argonne, IL; University of Chicago, Chicago, IL and Argonne National Laboratory, Argonne, IL

  • Venue: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
  • Year: 2007

Abstract

Identifying and diagnosing anomalies in application behavior is critical to delivering reliable application-level performance. In this paper we introduce a strategy to detect anomalies and diagnose the possible reasons behind them. Our approach extends the traditional window-based strategy by using signal-processing techniques to filter out recurring background fluctuations in resource behavior. In addition, we have developed a diagnosis technique that uses standard monitoring data to determine which related changes in behavior may have caused an anomaly. We evaluate our anomaly detection and diagnosis technique in three contexts, inserting anomalies into the system at random intervals. The experimental results show that our strategy detects up to 96% of anomalies while reducing the false positive rate by up to 90% compared to the traditional window-average strategy. In addition, our strategy can diagnose the reason for an anomaly approximately 75% of the time.
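
To make the detection idea concrete, the following is a minimal sketch of a window-average detector and of a variant that removes a recurring background pattern before applying the window test. It is not the authors' implementation: the abstract does not specify the filter, so a simple per-phase median profile is swapped in for the signal-processing step, and the period is assumed to be known in advance. The function names (`window_anomalies`, `remove_periodic`), the window size, the threshold `k`, and the synthetic series are all illustrative assumptions.

```python
from statistics import mean, median, pstdev


def window_anomalies(series, window=10, k=4.0):
    """Flag index i when series[i] deviates from the mean of the previous
    `window` samples by more than k standard deviations of that window."""
    flagged = []
    for i in range(window, len(series)):
        recent = series[i - window:i]
        mu = mean(recent)
        sigma = pstdev(recent) or 1e-9  # guard against a perfectly flat window
        if abs(series[i] - mu) > k * sigma:
            flagged.append(i)
    return flagged


def remove_periodic(series, period):
    """Subtract the median value seen at each phase of an assumed, known
    period. This is a stand-in for the paper's signal-processing filter."""
    profile = [median(series[p::period]) for p in range(period)]
    return [x - profile[i % period] for i, x in enumerate(series)]


# Synthetic load trace (hypothetical data): small deterministic noise,
# a recurring spike every 20 samples, and one injected anomaly at index 133.
PERIOD = 20
series = []
for i in range(200):
    value = 1.0 + 0.01 * ((i * 4) % 11 - 5)   # bounded background noise
    if i % PERIOD == 0:
        value += 2.0                           # recurring background spike
    if i == 133:
        value += 3.0                           # injected anomaly
    series.append(value)

plain = window_anomalies(series)
filtered = window_anomalies(remove_periodic(series, PERIOD))
print("plain window detector:   ", plain)
print("after periodic filtering:", filtered)
```

On this synthetic trace the plain window detector reports the recurring spikes as well as the injected anomaly, while the filtered variant reports only index 133. The median is used for the profile so that a single anomalous sample does not distort the estimated background for its phase.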