Scalable fine-grained call path tracing

  • Authors:
  • Nathan R. Tallent; John Mellor-Crummey; Michael Franco; Reed Landrum; Laksono Adhianto

  • Affiliations:
  • Rice University, Houston, TX, USA; Rice University, Houston, TX, USA; Rice University, Houston, TX, USA; Stanford University, Stanford, CA, USA; Rice University, Houston, TX, USA

  • Venue:
  • Proceedings of the International Conference on Supercomputing
  • Year:
  • 2011

Abstract

Applications must scale well to make efficient use of even medium-scale parallel systems. Because scaling problems are often difficult to diagnose, there is a critical need for scalable tools that guide scientists to the root causes of performance bottlenecks. Although tracing is a powerful performance-analysis technique, tools that employ it can quickly become bottlenecks themselves. Moreover, to obtain actionable performance feedback for modular parallel software systems, it is often necessary to collect and present fine-grained context-sensitive data, which is the very thing scalable tools avoid. Existing tracing tools can collect calling contexts, but only in a coarse-grained fashion, and no prior tool scalably presents data that is both context- and time-sensitive. This paper describes how to collect, analyze, and present fine-grained call path traces for parallel programs. To scale our measurements, we use asynchronous sampling, whose granularity is controlled by a sampling frequency, and a compact representation. To present traces at multiple levels of abstraction and at arbitrary resolutions, we use sampling to render complementary slices of calling-context-sensitive trace data. Because our techniques are general, they can be used on applications that use different parallel programming models (MPI, OpenMP, PGAS). This work is implemented in HPCToolkit.
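The two ideas named in the abstract, a compact representation of sampled call paths and rendering by sampling, can be illustrated with a minimal sketch. This is not HPCToolkit's code or file format. The names `CallingContextTree`, `Trace`, `record`, and `render` are hypothetical, and the sketch assumes that some external sampler (for example, a timer-driven signal handler) calls `record` with the current call path. Each distinct call path is stored once in a tree, so a trace record is only a timestamp and a node id. To draw a timeline, the trace is sampled once per pixel rather than read in full.

```python
import bisect


class CallingContextTree:
    """Stores each distinct call path once; a path is named by a node id."""

    def __init__(self):
        self.parent = [-1]        # node id -> parent id (node 0 is the root)
        self.frame = ["<root>"]   # node id -> procedure name
        self.children = [{}]      # node id -> {procedure name: child id}

    def intern(self, call_path):
        """Return the node id for call_path (outermost frame first)."""
        node = 0
        for proc in call_path:
            child = self.children[node].get(proc)
            if child is None:
                child = len(self.parent)
                self.parent.append(node)
                self.frame.append(proc)
                self.children.append({})
                self.children[node][proc] = child
            node = child
        return node

    def path(self, node):
        """Rebuild the call path (outermost frame first) from a node id."""
        frames = []
        while node != 0:
            frames.append(self.frame[node])
            node = self.parent[node]
        return frames[::-1]


class Trace:
    """One thread's trace: parallel lists of sample times and node ids."""

    def __init__(self, cct):
        self.cct = cct
        self.times = []   # assumed to be appended in increasing order
        self.nodes = []

    def record(self, time, call_path):
        """Called by a (hypothetical) asynchronous sampler at each sample."""
        self.times.append(time)
        self.nodes.append(self.cct.intern(call_path))

    def render(self, t_start, t_end, pixels, depth):
        """Sample the trace once per pixel; report the frame at `depth`.

        `depth` is 1-based; shallower paths report their innermost frame.
        """
        row = []
        width = (t_end - t_start) / pixels
        for p in range(pixels):
            t = t_start + (p + 0.5) * width              # pixel midpoint
            i = bisect.bisect_right(self.times, t) - 1   # latest sample <= t
            frames = self.cct.path(self.nodes[i]) if i >= 0 else []
            row.append(frames[min(depth, len(frames)) - 1] if frames else None)
        return row
```

With four samples at times 0, 10, 20, and 30, calling `render(0, 40, 4, depth=3)` yields one innermost-frame label per pixel, while `render(0, 40, 2, depth=2)` shows the same interval at half the resolution and one level up the call path. The cost of rendering depends on the number of pixels, not on the length of the trace, which is the property the abstract attributes to sampling-based presentation.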