Pipelined parallel prefix computations, and sorting on a pipelined hypercube
Journal of Parallel and Distributed Computing
Optimal multiple message broadcasting in telephone-like communication systems
Discrete Applied Mathematics
MPI-The Complete Reference, Volume 1: The MPI Core
MPI-The Complete Reference, Volume 1: The MPI Core
Performance Analysis of MPI Collective Operations
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 15 - Volume 16
On optimizing collective communication
CLUSTER '04 Proceedings of the 2004 IEEE International Conference on Cluster Computing
Collective operations in NEC's high-performance MPI libraries
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Parallel prefix (scan) algorithms for MPI
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
An optimal broadcast algorithm adapted to SMP clusters
PVM/MPI'05 Proceedings of the 12th European PVM/MPI users' group conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Optimal broadcast for fully connected networks
HPCC'05 Proceedings of the First international conference on High Performance Computing and Communications
Hi-index | 0.00 |
We present a new, simple algorithmic idea for exploiting the capability for bidirectional communication present in many modern interconnects for the collective MPI operations broadcast, reduction and scan. Our algorithms achieve up to twice the bandwidth of most previous and commonly used algorithms. In particular, our algorithms for reduction and scan are the currently best known. Experiments on clusters with Myrinet and InfiniBand interconnects show significant reductions in running time for broadcast and reduction, for reduction even close to the best possible factor of two.