An early prototype of an autonomic performance environment for exascale

  • Authors:
  • Kevin Huck; Sameer Shende; Allen Malony; Hartmut Kaiser; Allan Porterfield; Rob Fowler; Ron Brightwell

  • Affiliations:
  • University of Oregon, Eugene, Oregon; University of Oregon, Eugene, Oregon; University of Oregon, Eugene, Oregon; Louisiana State University, Baton Rouge, LA; RENCI, Chapel Hill, NC; RENCI, Chapel Hill, NC; Sandia National Labs, Albuquerque, NM

  • Venue:
  • Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
  • Year:
  • 2013

Abstract

Extreme-scale computing requires a new perspective on the role of performance observation in the Exascale system software stack. Because of the high concurrency and dynamic operation anticipated in these systems, it is no longer reasonable to expect that a post-mortem performance measurement and analysis methodology will suffice. Rather, there is a strong need for performance observation that merges first- and third-person observation, in situ analysis, and introspection across stack layers, and that serves online dynamic feedback and adaptation. In this paper, we describe the DOE-funded XPRESS project and the role of autonomic performance support in Exascale systems. XPRESS will build an integrated Exascale software stack (called OpenX) that supports the ParalleX execution model and is targeted towards future Exascale platforms. An initial version of an autonomic performance environment called APEX has been developed for OpenX using the current TAU performance technology, and results are presented that highlight the challenges of highly integrative observation and runtime analysis.