Adaptive system anomaly prediction for large-scale hosting infrastructures

  • Authors:
  • Yongmin Tan; Xiaohui Gu; Haixun Wang

  • Affiliations:
  • North Carolina State University, Raleigh, NC, USA; North Carolina State University, Raleigh, NC, USA; Microsoft Research Asia, Beijing, China

  • Venue:
  • Proceedings of the 29th ACM SIGACT-SIGOPS symposium on Principles of distributed computing
  • Year:
  • 2010

Abstract

Large-scale hosting infrastructures require automatic system anomaly management to achieve continuous system operation. In this paper, we present a novel adaptive runtime anomaly prediction system, called ALERT, to achieve robust hosting infrastructures. In contrast to traditional anomaly detection schemes, ALERT aims to raise advance anomaly alerts to enable just-in-time anomaly prevention. We propose a novel context-aware anomaly prediction scheme to improve prediction accuracy in dynamic hosting infrastructures. We have implemented the ALERT system and deployed it on several production hosting infrastructures, such as the IBM System S stream processing cluster and PlanetLab. Our experiments show that ALERT can achieve high prediction accuracy for a range of system anomalies while imposing low overhead on the hosting infrastructure.