Modeling the parallel execution of black-box services

  • Authors:
  • Gideon Mann;Mark Sandler;Darja Krushevskaja;Sudipto Guha;Eyal Even-Dar

  • Affiliations:
  • Google Inc., New York, NY;Google Inc., New York, NY;Rutgers University, New Brunswick, NJ;University of Pennsylvania, Philadelphia, PA;Final Inc., Herzliya-Pituach, Israel

  • Venue:
  • HotCloud'11 Proceedings of the 3rd USENIX conference on Hot topics in cloud computing
  • Year:
  • 2011

Abstract

Services running in a data center frequently rely on RPCs to child services (e.g., storage, cache, authentication), and their latency depends crucially on the latencies of those RPCs. However, even though service latency often comes exclusively from the time spent inside remote calls, it is difficult to determine parent latency, since multithreading and asynchronous RPCs lead to complex and non-linear dependencies between service and RPC latencies. In this paper, we present a model that can be used to estimate parent latency given RPC latencies, where the parallel dependencies among child services are modeled by an "execution flow", a directed acyclic graph. The model is learned from samples collected by a distributed tracing tool. Experiments demonstrate that these models are better able to predict top-level parent latency from child latency than state-of-the-art baselines such as linear regression and critical path analysis.
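
To illustrate why parent latency depends non-linearly on RPC latencies, here is a minimal sketch, not the paper's learned model. It assumes the dependency graph is already known and that an RPC starts as soon as all of its prerequisites finish, so sequential calls add up while parallel calls contribute only their maximum. The function name `parent_latency`, the service names, and the latency values are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parent_latency(deps, latency):
    """Estimate parent latency from child RPC latencies over a known DAG.

    deps maps each RPC to the set of RPCs that must finish before it starts.
    latency maps each RPC to its observed latency.
    Assumption: the parent spends no time outside its RPCs.
    """
    finish = {}
    # static_order() yields every node after all of its prerequisites.
    for node in TopologicalSorter(deps).static_order():
        start = max((finish[d] for d in deps.get(node, ())), default=0.0)
        finish[node] = start + latency[node]
    # The parent returns once the last RPC has finished.
    return max(finish.values(), default=0.0)


# Hypothetical flow: auth runs first, then cache and storage run in parallel.
deps = {"auth": set(), "cache": {"auth"}, "storage": {"auth"}}
latency_ms = {"auth": 5.0, "cache": 2.0, "storage": 40.0}
print(parent_latency(deps, latency_ms))  # 45.0
```

In this example, speeding up the cache call does not change the result at all, while speeding up storage does. A linear model of child latencies cannot capture that kind of dependence.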