In both the regular and the irregular MPI (Message-Passing Interface) collective communication and reduction interfaces, there is a correspondence between the argument lists and certain MPI derived datatypes. To address the well-known memory and performance scalability problems of the irregular (vector) collective interfaces of MPI, we propose to push this correspondence to its natural limit and replace the collective interfaces with a different set of interfaces that specify all data sizes and displacements solely by means of derived datatypes. This reduces the number of collective (communication and reduction) interfaces from 16 to 10, significantly generalizes the operations, unifies the regular and irregular collective interfaces, makes it possible to decouple certain algorithmic decisions from the collective operation, and moves the interface scalability issue from the collective interfaces to the MPI derived datatypes. To complete the proposal, we discuss the memory scalability of derived datatypes and suggest a number of alternative datatypes for MPI, some of which should be of independent interest. A running example illustrates the benefits of the alternative set of collective interfaces, and a discussion of implementation issues shows that the proposal can be realized within any reasonable MPI library.
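To make the correspondence concrete, the C sketch below (a minimal illustration, not the paper's proposed API; the variable names and the example layout are our own) shows how the (counts, displs) argument arrays of an irregular collective such as MPI_Gatherv carry exactly the information of a single indexed derived datatype: one unit of the committed type describes the same irregular placement that the arrays do, so an interface in the proposed style could accept a (buffer, datatype) pair where today's vector collectives require O(p) arrays. Only standard MPI calls are used.

/* Sketch of the array/datatype correspondence (not the proposed API). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* An irregular layout: block i holds i+1 doubles, with a one-element
     * gap between blocks -- the kind of data Gatherv describes with
     * count and displacement arrays. */
    int *counts = malloc(size * sizeof(int));
    int *displs = malloc(size * sizeof(int));
    int total = 0, extent = 0;
    for (int i = 0; i < size; i++) {
        counts[i] = i + 1;
        displs[i] = extent;
        total  += counts[i];
        extent += counts[i] + 1;   /* gap makes the layout non-contiguous */
    }

    /* The same information expressed as a single derived datatype. */
    MPI_Datatype layout;
    MPI_Type_indexed(size, counts, displs, MPI_DOUBLE, &layout);
    MPI_Type_commit(&layout);

    /* Receiving ONE unit of `layout` places a contiguous stream of
     * `total` doubles into the irregular positions, exactly as the
     * (counts, displs) arrays would direct MPI_Gatherv to do.  A
     * collective in the proposed style would take (buf, layout) here. */
    double *stream  = malloc(total * sizeof(double));
    double *laidout = calloc(extent, sizeof(double));
    for (int j = 0; j < total; j++) stream[j] = (double)j;

    MPI_Sendrecv(stream, total, MPI_DOUBLE, rank, 0,
                 laidout, 1, layout, rank, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    if (rank == 0 && size > 1)
        printf("block 1 starts at displacement %d: %.0f %.0f\n",
               displs[1], laidout[displs[1]], laidout[displs[1] + 1]);

    MPI_Type_free(&layout);
    free(counts); free(displs); free(stream); free(laidout);
    MPI_Finalize();
    return 0;
}

Note that the indexed type above still stores one (count, displacement) pair per block internally, so the O(p) space has not disappeared but merely moved into the datatype. This is precisely the sense in which the proposal shifts the scalability issue from the collective argument lists to the derived datatypes, and it motivates the alternative, more compactly representable datatype constructors discussed in the paper.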