A case for standard non-blocking collective operations

Authors:
Torsten Hoefler; Prabhanjan Kambadur; Richard L. Graham; Galen Shipman; Andrew Lumsdaine
Affiliations:
Open Systems Lab, Indiana University, Bloomington, IN and Chemnitz University of Technology, Chemnitz, Germany; Open Systems Lab, Indiana University, Bloomington, IN; National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, TN; Advanced Computing Laboratory, Los Alamos National Laboratory, Los Alamos, NM; Open Systems Lab, Indiana University, Bloomington, IN
Venue:
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Year:
2007

Abstract

In this paper we make the case for adding standard non-blocking collective operations to the MPI standard. The non-blocking point-to-point and blocking collective operations currently defined by MPI provide important performance and abstraction benefits. To allow these benefits to be simultaneously realized, we present an application programming interface for non-blocking collective operations in MPI. Microbenchmark and application-based performance results demonstrate that non-blocking collective operations offer not only improved convenience, but improved performance as well, when compared to manual use of threads with blocking collectives.
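
The usage pattern that a non-blocking collective enables (start the operation, do independent work, then wait on a request handle) can be illustrated with a small single-process sketch. This is not MPI and not the interface proposed in the paper: `ToyComm`, `iallreduce_sum`, `worker`, and `run` are hypothetical names, each "rank" is emulated by a Python thread, and the toy supports only one outstanding operation at a time.

```python
import threading
from concurrent.futures import Future


class ToyComm:
    """Hypothetical single-process stand-in for a communicator with `size` ranks."""

    def __init__(self, size):
        self.size = size
        self._lock = threading.Lock()
        self._contrib = {}   # rank -> value contributed to the pending collective
        self._pending = []   # one request handle per rank that has started it

    def iallreduce_sum(self, rank, value):
        """Start a sum all-reduce and return a request handle without blocking."""
        request = Future()
        with self._lock:
            self._contrib[rank] = value
            self._pending.append(request)
            if len(self._contrib) == self.size:
                # The last rank to arrive completes the operation for everyone.
                total = sum(self._contrib.values())
                finished, self._pending = self._pending, []
                self._contrib = {}
                for handle in finished:
                    handle.set_result(total)
        return request


def worker(comm, rank, results):
    request = comm.iallreduce_sum(rank, rank + 1)   # start the collective
    local = sum(i * i for i in range(1000))         # independent work, overlapped
    total = request.result()                        # wait for completion
    results[rank] = (total, local)


def run(size=4):
    """Emulate `size` ranks with threads; return one (total, local) pair per rank."""
    comm = ToyComm(size)
    results = [None] * size
    threads = [
        threading.Thread(target=worker, args=(comm, rank, results))
        for rank in range(size)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run(4)` has each emulated rank contribute `rank + 1`, so every rank should observe the same reduced total once its request completes. In a real MPI program the progress of the collective is the library's responsibility; the threads here exist only to emulate separate ranks inside one process.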