Pinpoint: Problem Determination in Large, Dynamic Internet Services

  • Authors:
  • Mike Y. Chen;Emre Kiciman;Eugene Fratkin;Armando Fox;Eric Brewer

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today's large, distributed, and dynamic application environments such as e-commerce systems. In this paper, we present a dynamic analysis methodology that automates problem determination in these environments by 1) coarse-grained tagging of numerous real client requests as they travel through the system and 2) using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault. To validate our methodology, we have implemented Pinpoint, a framework for root cause analysis on the J2EE platform that requires no knowledge of the application components. Pinpoint consists of three parts: a communications layer that traces client requests, a failure detector that uses traffic-sniffing and middleware instrumentation, and a data analysis engine. We evaluatePinpoint by injecting faults into various application components and show that Pinpoint identifies the faulty components with high accuracy and produces few false-positives.