Trace-based parallel performance overhead compensation

  • Authors:
  • Felix Wolf; Allen D. Malony; Sameer Shende; Alan Morris

  • Affiliations:
  • Innovative Computing Laboratory, University of Tennessee; Department of Computer and Information Science, University of Oregon; Department of Computer and Information Science, University of Oregon; Department of Computer and Information Science, University of Oregon

  • Venue:
  • HPCC'05: Proceedings of the First International Conference on High Performance Computing and Communications
  • Year:
  • 2005

Abstract

Tracing a parallel program to observe its performance perturbs the program, because trace measurement overhead intrudes on its execution. If post-mortem trace analysis does not compensate for this overhead, the intrusion leads to errors in the performance results. We show that measurement overhead can be accounted for during trace analysis, so that the intrusion can be modeled and removed. Algorithms developed in our earlier work [5] are reimplemented in a more robust and modern tool, KOJAK [12], allowing them to be applied to large-scale parallel programs. The ability to reduce trace measurement error is demonstrated for a Monte Carlo simulation based on a master/worker scheme. As an additional result, we visualize how local perturbation propagates across process boundaries and alters the behavioral characteristics of non-local processes.
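
The idea of removing overhead during post-mortem analysis can be illustrated with a minimal Python sketch. It is not the algorithm from [5] or the KOJAK implementation; it is a toy model under assumptions of our own: every recorded event costs the same constant `overhead`, a blocking receive completes as soon as both the receiver has entered it and the message has arrived, and the measured trace is causally ordered (each send is stamped before its matching receive exit). The `Event` type, the event kinds, and the `compensate` function are all hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    proc: int                     # process that recorded the event
    time: float                   # measured (perturbed) timestamp
    kind: str                     # "local", "send", "recv_enter", "recv_exit"
    msg_id: Optional[int] = None  # links a send to its recv_exit


def compensate(events: List[Event], overhead: float,
               latency: float = 0.0) -> List[float]:
    """Return approximated overhead-free timestamps, in input order.

    Assumption: each recorded event delays its process by `overhead`,
    and a receive itself costs nothing beyond waiting for the message.
    """
    order = sorted(range(len(events)), key=lambda i: events[i].time)
    last_measured = {}   # proc -> measured time of its previous event
    last_adjusted = {}   # proc -> adjusted time of its previous event
    send_adjusted = {}   # msg_id -> adjusted time of the send
    adjusted = [0.0] * len(events)

    for i in order:
        ev = events[i]
        if ev.proc not in last_measured:
            t = ev.time  # first event on a process: nothing to remove yet
        elif ev.kind == "recv_exit":
            # Completion is decided by whichever happens later in the
            # overhead-free execution: entering the receive locally or
            # the arrival of the message from the remote sender.
            t = max(last_adjusted[ev.proc],
                    send_adjusted[ev.msg_id] + latency)
        else:
            # Local gap minus the cost of recording the previous event.
            gap = max(ev.time - last_measured[ev.proc] - overhead, 0.0)
            t = last_adjusted[ev.proc] + gap
        if ev.kind == "send":
            send_adjusted[ev.msg_id] = t
        last_measured[ev.proc] = ev.time
        last_adjusted[ev.proc] = t
        adjusted[i] = t
    return adjusted


# Made-up two-process trace: a worker (proc 0) records several events and
# sends a result; the master (proc 1) blocks in a receive until it arrives.
trace = [
    Event(0, 0.0, "local"),
    Event(0, 2.0, "local"),
    Event(0, 4.0, "local"),
    Event(0, 6.0, "send", msg_id=1),
    Event(1, 0.0, "recv_enter"),
    Event(1, 7.0, "recv_exit", msg_id=1),
    Event(1, 9.0, "local"),
]
approx = compensate(trace, overhead=1.0)
```

In this made-up trace the master's last event moves from measured time 9.0 to 4.0, although the master recorded only two events before it. Most of that shift comes from overhead accumulated on the worker and carried over by the message, which is the kind of cross-process propagation of perturbation the abstract refers to.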