Efficient allgather for regular SMP-Clusters

Authors:
Jesper Larsson Träff
Affiliations:
C&C Research Laboratories, NEC Europe Ltd., Sankt Augustin, Germany
Venue:
EuroPVM/MPI'06 Proceedings of the 13th European PVM/MPI User's Group conference on Recent advances in parallel virtual machine and message passing interface
Year:
2006


Abstract

We show how to adapt and extend a well-known allgather (all-to-all broadcast) algorithm to parallel systems with a hierarchical communication system, such as clusters of SMP nodes. For small problem sizes, the new algorithm requires a number of communication rounds that is logarithmic in the number of SMP nodes, and it degrades gracefully towards a linear algorithm as the problem size increases. The algorithm has been used to implement the MPI_Allgather collective operation in the MPI/SX library. Performance measurements on a 72-node SX-8 system show that graceful degradation provides a smooth transition from logarithmic to linear behavior, and that the new algorithm significantly outperforms a standard linear algorithm. The performance of the latter is, furthermore, highly sensitive to the distribution of MPI processes over the physical processors.
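
To make the contrast between logarithmic and linear round counts concrete, the following is a minimal simulation sketch. It is not the algorithm of the paper: it ignores the SMP hierarchy and message sizes entirely and only compares a flat ring allgather with a flat dissemination-style (doubling) allgather. The function names `ring_allgather` and `dissemination_allgather` are hypothetical, chosen for this illustration.

```python
def ring_allgather(blocks):
    """Linear scheme: in each of p-1 rounds every rank forwards one block
    to its right-hand neighbour. Returns (per-rank results, rounds)."""
    p = len(blocks)
    have = [{r: blocks[r]} for r in range(p)]  # have[r] maps origin rank -> block
    rounds = 0
    for step in range(p - 1):
        # In this round, rank r sends the block that originated at (r - step) mod p.
        sends = [((r - step) % p, have[r][(r - step) % p]) for r in range(p)]
        for r in range(p):
            origin, data = sends[(r - 1) % p]  # receive from the left-hand neighbour
            have[r][origin] = data
        rounds += 1
    return [[have[r][i] for i in range(p)] for r in range(p)], rounds


def dissemination_allgather(blocks):
    """Logarithmic scheme: in round k every rank receives everything held by
    the rank 2**k positions ahead. Returns (per-rank results, rounds)."""
    p = len(blocks)
    have = [{r: blocks[r]} for r in range(p)]
    rounds = 0
    dist = 1
    while dist < p:
        # Snapshot so that all exchanges in a round use the pre-round state.
        snapshot = [dict(h) for h in have]
        for r in range(p):
            # Simplification (assumption): the sender ships all blocks it holds;
            # a real implementation would send only the blocks still missing.
            have[r].update(snapshot[(r + dist) % p])
        dist *= 2
        rounds += 1
    return [[have[r][i] for i in range(p)] for r in range(p)], rounds


if __name__ == "__main__":
    data = [f"block-{r}" for r in range(72)]
    _, linear_rounds = ring_allgather(data)
    _, log_rounds = dissemination_allgather(data)
    print("ring rounds:", linear_rounds)
    print("dissemination rounds:", log_rounds)
```

With 72 participants the ring scheme needs 71 rounds and the doubling scheme needs 7. The doubling scheme moves more data per round, which is one reason a logarithmic algorithm can lose its advantage as the problem size grows.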