Non-data-communication Overheads in MPI: Analysis on Blue Gene/P

  • Authors:
  • Pavan Balaji; Anthony Chan; William Gropp; Rajeev Thakur; Ewing Lusk

  • Affiliations:
  • Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA; Dept. of Astronomy and Astrophysics, Univ. of Chicago, Chicago, IL 60637, USA; Dept. of Computer Science, Univ. of Illinois, Urbana, IL 61801, USA; Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA; Math. and Comp. Sci. Div., Argonne Nat. Lab., Argonne, IL 60439, USA

  • Venue:
  • Proceedings of the 15th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
  • Year:
  • 2008

Abstract

Modern HEC systems, such as Blue Gene/P, achieve high performance by exploiting the parallelism of a massive number of low-frequency, low-power processing cores. This means that the local pre- and post-communication processing required by the MPI stack might not be very fast, owing to the slow processing cores. Similarly, small amounts of serialization within the MPI stack that were acceptable on small/medium systems can be brutal on massively parallel systems. In this paper, we study different non-data-communication overheads within the MPI implementation on the IBM Blue Gene/P system.
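
Non-data-communication overheads of the kind studied here can be approximated with a simple zero-byte ping-pong micro-benchmark: with no payload, the measured latency is dominated by pre- and post-communication processing in the MPI stack (and the network), rather than by data-transfer time. The sketch below is only an illustration of that idea, not the benchmark suite used in the paper; the iteration count and the use of ranks 0 and 1 are arbitrary choices.

    /* Zero-byte ping-pong sketch (illustrative, not the paper's benchmark).
     * Run with at least two ranks; ranks 0 and 1 exchange empty messages
     * and report the average one-way latency, which reflects per-message
     * MPI stack overhead rather than payload transfer. */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS 100000

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(NULL, 0, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(NULL, 0, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("avg one-way zero-byte latency: %.3f us\n",
                   (t1 - t0) * 1e6 / (2.0 * ITERS));

        MPI_Finalize();
        return 0;
    }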