Process cooperation in multiple message broadcast

Authors:
Bin Jia
Affiliations:
IBM Advanced Clustering Technology Team, Poughkeepsie, NY
Venue:
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Year:
2007

Abstract

We present a process cooperation algorithm for broadcasting m messages among n processes, m ≥ 1, n ≥ 1, in one-port fully-connected communication systems. In this algorithm, the n processes are organized into 2^⌊log n⌋ one- or two-process units. Messages are broadcast among the units according to a basic communication schedule. The processes in each two-process unit cooperate to carry out the basic schedule in such a way that, at any step, either process has at most one message that the other has not received. The algorithm completes the broadcast in ⌈log n⌉ + m − 1 communication steps, which is theoretically optimal. An empirical study shows that it significantly outperforms other widely used algorithms when the data to broadcast is large. Efficient construction of the communication schedule is a salient feature of this algorithm: both the basic schedule and the cooperation schedule are constructed in O(log n) bitwise operations on process ranks.
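The two counts quoted in the abstract can be illustrated with a minimal Python sketch. It only evaluates the stated quantities (the number of units and the step count, with logarithms taken base 2) and does not reproduce the paper's basic or cooperation schedule. The function names `unit_layout` and `optimal_steps` are hypothetical, and the split into one-process and two-process units is an inference from the requirement that 2^⌊log n⌋ units of size one or two cover exactly n processes.

```python
def unit_layout(n: int) -> tuple[int, int, int]:
    """Return (units, one_process_units, two_process_units) for n processes.

    Assumption: 2^floor(log2 n) units, each holding one or two processes,
    must cover all n processes, which fixes how many units hold two.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    units = 1 << (n.bit_length() - 1)   # 2^floor(log2 n), bitwise only
    two_process_units = n - units       # processes left over after one per unit
    one_process_units = units - two_process_units
    return units, one_process_units, two_process_units


def optimal_steps(n: int, m: int) -> int:
    """Step count stated in the abstract: ceil(log2 n) + m - 1."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be at least 1")
    ceil_log2_n = (n - 1).bit_length()  # equals ceil(log2 n) for n >= 1
    return ceil_log2_n + m - 1


if __name__ == "__main__":
    print(unit_layout(6))       # 4 units: 2 single-process, 2 two-process
    print(optimal_steps(6, 4))  # ceil(log2 6) + 4 - 1
```

For example, with n = 6 the sketch yields four units, two of which contain a pair of cooperating processes, and broadcasting m = 4 messages takes 3 + 4 − 1 = 6 steps under the stated bound.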