Non-data-communication Overheads in MPI: Analysis on Blue Gene/P
Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
With processor speeds no longer doubling every 18–24 months owing to the exponential increase in power consumption and heat dissipation, modern high-end computing systems tend to rely less on the performance of single processing units and instead achieve high performance through the parallelism of a massive number of low-frequency/low-power processing cores. Using such low-frequency cores, however, puts a premium on the end-host pre- and post-communication processing required within communication stacks, such as the Message Passing Interface (MPI) implementation. Similarly, small amounts of serialization within the communication stack that were acceptable on small and medium systems can be brutal on massively parallel systems. Thus, in this paper, we study the different non-data-communication overheads within the MPI implementation on the IBM Blue Gene/P system. Specifically, we analyze various aspects of MPI, including the overhead of the MPI stack itself, the overhead of allocating and queueing requests, queue searches within the MPI stack, multi-request operations, and several others. Our experiments, which scale up to 131,072 cores of the largest Blue Gene/P system in the world (80% of the total system size), reveal several insights into overheads in the MPI stack that were not previously considered significant but can have a substantial impact on such massive systems.
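To illustrate the kind of non-data-communication cost the abstract refers to (request allocation, queue matching, and multi-request completion), the following is a minimal micro-benchmark sketch in C. It is not taken from the paper; the request count, message size, and timing scheme are illustrative assumptions. It posts many tiny nonblocking operations between two ranks so that the measured time is dominated by the MPI stack's per-request processing rather than by data transfer.

```c
/* Hypothetical sketch (assumed, not from the paper): times the request
 * allocation, queue matching, and MPI_Waitall completion path for many
 * tiny messages, where MPI stack overhead dominates data movement. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NREQ 1024   /* number of outstanding requests (assumed) */
#define MSG  8      /* tiny payload so timing reflects MPI overhead */

int main(int argc, char **argv)
{
    int rank, i;
    char (*buf)[MSG];
    MPI_Request req[NREQ];
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(NREQ * sizeof *buf);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    if (rank == 0) {
        /* Posting many receives exercises request allocation and the
         * posted-receive queue that incoming sends must be matched against. */
        for (i = 0; i < NREQ; i++)
            MPI_Irecv(buf[i], MSG, MPI_CHAR, 1, i, MPI_COMM_WORLD, &req[i]);
        MPI_Waitall(NREQ, req, MPI_STATUSES_IGNORE);
    } else if (rank == 1) {
        for (i = 0; i < NREQ; i++)
            MPI_Isend(buf[i], MSG, MPI_CHAR, 0, i, MPI_COMM_WORLD, &req[i]);
        MPI_Waitall(NREQ, req, MPI_STATUSES_IGNORE);
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average time per request: %g us\n",
               1e6 * (t1 - t0) / NREQ);

    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with two ranks (e.g., `mpiexec -n 2 ./overhead`); varying NREQ gives a rough sense of how per-request cost grows as the internal queues lengthen, which is one of the effects the paper analyzes at much larger scale.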