Detecting application-level failures in component-based Internet services

  • Authors:
  • E. Kiciman;A. Fox

  • Affiliations:
  • Dept. of Comput. Sci., Stanford Univ., CA, USA;-

  • Venue:
  • IEEE Transactions on Neural Networks
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most Internet services (e-commerce, search engines, etc.) suffer faults. Quickly detecting these faults can be the largest bottleneck in improving availability of the system. We present Pinpoint, a methodology for automating fault detection in Internet services by: 1) observing low-level internal structural behaviors of the service; 2) modeling the majority behavior of the system as correct; and 3) detecting anomalies in these behaviors as possible symptoms of failures. Without requiring any a priori application-specific information, Pinpoint correctly detected 89%-96% of major failures in our experiments, as compared with 20%-70% detected by current application-generic techniques.