Scalable detection of MPI-2 remote memory access inefficiency patterns

  • Authors:
  • Marc-André Hermanns;Markus Geimer;Bernd Mohr;Felix Wolf

  • Affiliations:
  • Laboratory for Parallel Programming, German Research School for Simulation Sciences, Germany;Forschungszentrum Jülich, Jülich Supercomputing Centre, Germany;Forschungszentrum Jülich, Jülich Supercomputing Centre, Germany;Forschungszentrum Jülich, Jülich Supercomputing Centre, Germany, Laboratory for Parallel Programming, German Research School for Simulation Sciences, Germany, Department of Computer Scie ...

  • Venue:
  • International Journal of High Performance Computing Applications
  • Year:
  • 2012

Abstract

Wait states in parallel applications can be identified by scanning event traces for characteristic patterns. In earlier work we defined such inefficiency patterns for MPI-2 one-sided communication, but their detection still relied on a serial trace-analysis scheme with limited scalability. In this article we show how wait states in one-sided communication can be detected more scalably using a trace-analysis approach based on parallel replay, which was originally developed for MPI-1 point-to-point and collective communication. Moreover, we demonstrate the scalability of our method and its usefulness for the optimization cycle with applications running on up to 32,768 cores.
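
As a rough illustration of what scanning a trace for a wait-state pattern can mean, the sketch below computes the waiting time at a collective synchronization such as `MPI_Win_fence`. It is a minimal serial example and not the authors' parallel replay. The `Event` record and the `wait_at_sync` helper are hypothetical names introduced here, and the sketch assumes that each process enters the region exactly once and that no process can leave before the last one has entered, which simplifies the actual fence semantics.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical trace record; real trace formats carry more fields."""
    rank: int     # process that recorded the event
    kind: str     # "enter" or "exit"
    region: str   # name of the region, e.g. "MPI_Win_fence"
    time: float   # timestamp in seconds


def wait_at_sync(events, region):
    """Return {rank: waiting time} for one collective call to `region`.

    Assumption: every process waits until the last process has entered,
    so its waiting time is (latest enter time) - (its own enter time).
    """
    enters = {
        e.rank: e.time
        for e in events
        if e.kind == "enter" and e.region == region
    }
    if not enters:
        return {}
    last = max(enters.values())
    return {rank: last - t for rank, t in enters.items()}


# Toy trace: rank 2 arrives late, so ranks 0 and 1 accumulate waiting time.
trace = [
    Event(0, "enter", "MPI_Win_fence", 1.0),
    Event(1, "enter", "MPI_Win_fence", 1.5),
    Event(2, "enter", "MPI_Win_fence", 3.0),
]
waits = wait_at_sync(trace, "MPI_Win_fence")
print(waits)
```

In this sketch all enter events are gathered in one place, which mirrors the serial scheme the abstract describes as limited in scalability. A replay-based analysis would instead have each process exchange its own timestamp with its peers.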