Diagnosing performance changes by comparing request flows

  • Authors:
  • Raja R. Sambasivan;Alice X. Zheng;Michael De Rosa;Elie Krevat;Spencer Whitman;Michael Stroucken;William Wang;Lianghong Xu;Gregory R. Ganger

  • Affiliations:
  • Carnegie Mellon University;Microsoft Research;Google;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University

  • Venue:
  • Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI '11)
  • Year:
  • 2011

Abstract

The causes of performance changes in a distributed system often elude even its developers. This paper develops a new technique for gaining insight into such changes: comparing request flows from two executions (e.g., of two system versions or time periods). Building on end-to-end request-flow tracing within and across components, algorithms are described for identifying and ranking changes in the flow and/or timing of request processing. The implementation of these algorithms in a tool called Spectroscope is evaluated. Six case studies are presented of using Spectroscope to diagnose performance changes in a distributed storage service caused by code changes, configuration modifications, and component degradations, demonstrating the value and efficacy of comparing request flows. Preliminary experiences of using Spectroscope to diagnose performance changes within select Google services are also presented.
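
To make the idea of comparing request flows concrete, the following is a minimal sketch and not Spectroscope's actual algorithms, which the abstract does not specify. It assumes each traced request has been reduced to a hypothetical (structure, latency) pair, groups requests by structure, flags timing changes (same flow, slower) and flow changes (more requests taking a given path), and ranks them by a rough estimate of added time. The function names, thresholds, and impact estimate are all illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def group_by_structure(requests):
    """Group (structure, latency_ms) pairs by request-flow structure."""
    groups = defaultdict(list)
    for structure, latency_ms in requests:
        groups[structure].append(latency_ms)
    return groups


def rank_changes(before, after, min_count_delta=2, min_latency_ratio=1.5):
    """Return (kind, structure, impact) tuples, largest impact first.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    b, a = group_by_structure(before), group_by_structure(after)
    changes = []
    for structure in set(b) | set(a):
        nb = len(b.get(structure, []))
        na = len(a.get(structure, []))
        if nb and na:
            mb, ma = mean(b[structure]), mean(a[structure])
            if ma >= min_latency_ratio * mb:
                # Timing change: same flow, but slower in the second execution.
                changes.append(("timing", structure, na * (ma - mb)))
                continue
        if na > 0 and na - nb >= min_count_delta:
            # Flow change: more requests take this path than before.
            changes.append(("flow", structure, (na - nb) * mean(a[structure])))
    return sorted(changes, key=lambda c: c[2], reverse=True)


# Hypothetical traces: structures are strings naming the components visited.
before = [("A>B", 10)] * 5 + [("A>C", 20)] * 5
after = [("A>B", 30)] * 5 + [("A>C", 20)] * 2 + [("A>C>D", 50)] * 3

for kind, structure, impact in rank_changes(before, after):
    print(kind, structure, impact)
```

In this toy input, three requests that previously followed A>C now follow A>C>D, and requests following A>B have become slower; the ranking puts the change with the larger estimated added time first. A real tool would replace the fixed thresholds with statistical tests and would work on full request-flow graphs rather than strings.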