Selectively retrofitting monitoring in distributed systems

  • Authors:
  • Animashree Anandkumar; Chatschik Bisdikian; Ting He; Dakshi Agrawal

  • Affiliations:
  • Massachusetts Institute of Technology, Cambridge, MA; IBM Watson Research, Hawthorne, NY; IBM Watson Research, Hawthorne, NY; IBM Watson Research, Hawthorne, NY

  • Venue:
  • ACM SIGMETRICS Performance Evaluation Review
  • Year:
  • 2009

Abstract

Current distributed systems carry legacy subsystems that lack sufficient instrumentation for monitoring the end-to-end business transactions they support. In the absence of instrumentation, only probabilistic monitoring is possible, using time-stamped log records. Retrofitting these systems with monitoring instrumentation is expensive but provides high-granularity, precise tracking of transactions. Given a limited budget, local instrumentation strategies that maximize the effectiveness of monitoring transactions throughout the system are proposed. The operation of the end-to-end system is modeled by a queuing network; each queue represents a subsystem that produces time-stamped log records as transactions pass through it. Two simple heuristics for instrumentation are proposed, each of which becomes optimal under certain conditions. One heuristic selects states in the transition diagram for local instrumentation in decreasing order of the load factors of their queues. Sufficient conditions for this load-factor heuristic to be optimal are proven using the notion of stochastic order. The other heuristic selects states in the transition diagram based on an approximation of the tracking accuracy of probabilistic monitoring at each state; this approximation is shown to be tight at low arrival rates.
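
The load-factor heuristic described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each state maps to a single queue, that the load factor is the ratio of arrival rate to service rate, and that every instrumented state costs one unit of budget. The names `State`, `select_by_load_factor`, and the example subsystems and rates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A state in the transition diagram, backed by one queue (assumption)."""
    name: str
    arrival_rate: float  # transactions per unit time arriving at the queue
    service_rate: float  # transactions per unit time the queue can serve

    @property
    def load_factor(self) -> float:
        # Assumed definition: load factor = arrival rate / service rate.
        return self.arrival_rate / self.service_rate


def select_by_load_factor(states, budget):
    """Pick up to `budget` states in decreasing order of load factor.

    Assumes a unit instrumentation cost per state (hypothetical cost model).
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    ranked = sorted(states, key=lambda s: s.load_factor, reverse=True)
    return [s.name for s in ranked[:budget]]


# Hypothetical subsystems and rates, for illustration only.
states = [
    State("auth", arrival_rate=2.0, service_rate=10.0),
    State("billing", arrival_rate=6.0, service_rate=8.0),
    State("inventory", arrival_rate=3.0, service_rate=6.0),
]
chosen = select_by_load_factor(states, budget=2)
print(chosen)
```

The sketch only ranks and truncates; whether this ordering is optimal depends on the sufficient conditions the paper establishes, which are not reproduced here.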