This paper describes the implementation and usage of, and experience with, the MPI performance revealing extension interface (Peruse) in the Open MPI implementation. While the PMPI interface allows MPI functions to be timed through wrappers, it cannot provide MPI-internal information on MPI states and lower-level network performance. We introduce the general design criteria of the interface implementation and analyze the overhead generated by this functionality. To support performance evaluation of large-scale applications, visualization tools are imperative. We extend the tracing library of the Paraver toolkit to support tracing Peruse events and show how this helps detect performance bottlenecks. A test suite and a real-world application are traced and visualized using Paraver.
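
The limitation of wrapper-based timing mentioned above can be illustrated with a minimal, MPI-free sketch in C. All names here (`lib_send`, `lib_send_impl`, `send_stats`) are hypothetical stand-ins and not part of MPI, Peruse, or Open MPI; in a real PMPI tool the wrapper would be named `MPI_Send` and would forward to `PMPI_Send`. The sketch only assumes a C11 compiler for `timespec_get`.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a library's "real" entry point (the role that
 * PMPI_Send plays in MPI). It is NOT an MPI call and does no communication. */
static int lib_send_impl(const void *buf, int count) {
    (void)buf;                      /* payload is ignored in this sketch */
    return count >= 0 ? 0 : -1;     /* 0 on success, -1 on invalid count */
}

/* Statistics a wrapper-based tool can collect: call count and elapsed time. */
typedef struct {
    unsigned long calls;
    double seconds;
} call_stats;

static call_stats send_stats = {0, 0.0};

/* Wall-clock time in seconds, using C11 timespec_get. */
static double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* The wrapper: it sees only the entry and exit of the call. Anything that
 * happens inside lib_send_impl (queueing, matching, network transfer) stays
 * invisible, which is the gap an event interface is meant to close. */
int lib_send(const void *buf, int count) {
    double t0 = now_seconds();
    int rc = lib_send_impl(buf, count);
    send_stats.seconds += now_seconds() - t0;
    send_stats.calls++;
    return rc;
}
```

Calling `lib_send` from application code updates `send_stats`, which a tool could print at exit. The wrapper can report how long each call took, but not why; the library-internal events stay hidden from it.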