PARSE 2.0: A Tool for Parallel Application Run Time Behavior Evaluation

  • Authors:
  • Jeffrey J. Evans; Charles E. Lucas

  • Venue:
  • ICDCSW '11 Proceedings of the 2011 31st International Conference on Distributed Computing Systems Workshops
  • Year:
  • 2011

Abstract

Run time variability of parallel applications continues to be a significant challenge in high performance computing (HPC) systems. We are currently studying run time variability in the context of both systemic performance and energy management. Our perspective is that of the application, focusing on the interactions between the inter-process communication system and the set of concurrently executing parallel applications. In such a scenario, application run time can be extended and become highly variable. While some applications may be more sensitive to these interactions, others may in fact be generating the interactions that cause inconsistent run time, which gives rise to the notion of application-level behavioral attributes. To gain insight into this problem, our earlier work developed PACE, a framework that emulates parallel applications. We also introduced a Parallel Application Run time Sensitivity Evaluation (PARSE) function that uses the PACE framework to study the run time effects of controlled network performance degradation on applications. Inter-process communication has evolved over the last decade from network communication between single-processor, single-core nodes to hybrid systems whose compute nodes contain several multi-core processor units. Motivated by this evolution of compute hardware and systems software, this work introduces PARSE 2.0, a nearly complete rewrite that extends PARSE to fully automate the evaluation and quantification of run-time-critical, application-level behavioral attributes of parallel applications. We present an overview of the tool and the attributes being evaluated, along with experimental results from tests conducted on several widely used parallel benchmarks and application code fragments. The results reinforce our earlier work, demonstrating that parallel applications can be classified according to their behavioral attributes in the context of communication system resources.