Understanding the formation of wait states in applications with one-sided communication

Authors:
Marc-André Hermanns; Manfred Miklosch; David Böhme; Felix Wolf
Affiliations:
RWTH Aachen University, Aachen, Germany; University of Hagen, Hagen, Germany; RWTH Aachen University, Aachen, Germany; RWTH Aachen University, Aachen, Germany
Venue:
Proceedings of the 20th European MPI Users' Group Meeting
Year:
2013


Abstract

In earlier work, we added two trace-analysis techniques to Scalasca, a performance analysis tool designed for large-scale applications. These techniques help explain how wait states form in MPI programs and support the user in finding optimization targets in the case of load imbalance, a major source of wait states. In this paper, we show how the two techniques, which were originally restricted to two-sided and collective MPI communication, are extended to also cover one-sided communication. We report our experience with benchmark programs and with a mini-application representing the core of the POP ocean model.