Implementation and usage of the PERUSE-Interface in Open MPI

Authors:
Rainer Keller;George Bosilca;Graham Fagg;Michael Resch;Jack J. Dongarra
Affiliations:
High-Performance Computing Center, University of Stuttgart;Innovative Computing Laboratory, University of Tennessee;Innovative Computing Laboratory, University of Tennessee;High-Performance Computing Center, University of Stuttgart;Innovative Computing Laboratory, University of Tennessee
Venue:
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
Year:
2006

Abstract

This paper describes the implementation and usage of, and experience with, the MPI performance revealing extension interface (Peruse) in the Open MPI implementation. While the PMPI interface allows MPI functions to be timed through wrappers, it cannot provide MPI-internal information on MPI states and lower-level network performance. We introduce the general design criteria of the interface implementation and analyze the overhead generated by this functionality. To support performance evaluation of large-scale applications, visualization tools are imperative. We extend the tracing library of the Paraver toolkit to support tracing Peruse events and show how this helps to detect performance bottlenecks. A test suite and a real-world application are traced and visualized using Paraver.