Improving the scalability of performance evaluation tools

  • Authors:
  • Sameer Suresh Shende;Allen D. Malony;Alan Morris

  • Affiliations:
  • Performance Research Laboratory, Department of Computer and Information Science, University of Oregon, Eugene, OR;Performance Research Laboratory, Department of Computer and Information Science, University of Oregon, Eugene, OR;Performance Research Laboratory, Department of Computer and Information Science, University of Oregon, Eugene, OR

  • Venue:
  • PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2
  • Year:
  • 2010

Abstract

Performance evaluation tools play an important role in helping users understand application performance, diagnose performance problems, and guide tuning decisions on modern HPC systems. Tools that observe parallel performance must evolve to keep pace with the ever-increasing complexity of these systems. In this paper, we describe our experience in building novel tools and techniques in the TAU Performance System® to observe application performance effectively and efficiently at scale. We describe the extensions to TAU that contend with the large data volumes associated with increasing core counts. These changes include new instrumentation choices, efficient handling of disk I/O operations in the measurement layer, and strategies for visualizing performance data at scale in TAU's analysis layer, among others. We also describe some techniques that allow us to fully characterize the performance of applications running on hundreds of thousands of cores.