Computing Global Combine Operations in the Multiport Postal Model

Authors:
Amotz Bar-Noy;Jehoshua Bruck;Ching-Tien Ho;Shlomo Kipnis;Baruch Schieber
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1995

Citing 14
Cited 4

Solving problems on concurrent processors. Vol. 1: General techniques and regular problems

Solving problems on concurrent processors. Vol. 1: General techniques and regular problems
Optimum Broadcasting and Personalized Communication in Hypercubes

IEEE Transactions on Computers
Intensive hypercube communication. Prearranged communication in link-bound machines

Journal of Parallel and Distributed Computing
The network architecture of the Connection Machine CM-5 (extended abstract)

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
LogP: towards a realistic model of parallel computation

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
The AURORA gigabit testbed

Computer Networks and ISDN Systems - Special issue on high speed networks
Optimal broadcast and summation in the LogP model

SPAA '93 Proceedings of the fifth annual ACM symposium on Parallel algorithms and architectures
The IBM external user interface for scalable parallel systems

Parallel Computing - Special issue: message passing interfaces
Efficient algorithms for all-to-all communications in multi-port message-passing systems

SPAA '94 Proceedings of the sixth annual ACM symposium on Parallel algorithms and architectures
Designing broadcasting algorithms in the Postal Model for message-passing systems

Proceedings of the 4th ACM symposium on Parallel algorithms and architectures
Optimal computation of census functions in the postal model

Discrete Applied Mathematics
Architecture and Implementation of Vulcan

Proceedings of the 8th International Symposium on Parallel Processing
CCL: A Portable and Tunable Collective Communication Library for Scalable Parallel Computers

Proceedings of the 8th International Symposium on Parallel Processing
Document for a Standard Message-Passing Interface

Document for a Standard Message-Passing Interface

An Extended Dominating Node Approach to Broadcast and Global Combine in Multiport Wormhole-Routed Mesh Networks

IEEE Transactions on Parallel and Distributed Systems
Modeling parallel bandwidth: local vs. global restrictions

Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures
Optimal broadcast for fully connected processor-node networks

Journal of Parallel and Distributed Computing
Bandwidth optimal all-reduce algorithms for clusters of workstations

Journal of Parallel and Distributed Computing

Abstract

Consider a message-passing system of n processors, in which each processor holds one piece of data initially. The goal is to compute an associative and commutative reduction function on the n pieces of data and to make the result known to all the n processors. This operation is frequently used in many message-passing systems and is typically referred to as global combine, census computation, or gossiping. This paper explores the problem of global combine in the multiport postal model. This model is characterized by three parameters: n, the number of processors; k, the number of ports per processor; and λ, the communication latency. In this model, in every round r, each processor can send k distinct messages to k other processors, and it can receive k messages that were sent from k other processors λ − 1 rounds earlier. This paper provides an optimal algorithm for the global combine problem that requires the least number of communication rounds and minimizes the time spent by any processor in sending and receiving messages.
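
The round, port, and latency accounting of the model can be illustrated with a small simulation. The sketch below is not the paper's optimal algorithm; it is a naive direct-exchange baseline in which every processor sends its own datum to every other processor, k destinations per round. The function name `naive_global_combine` and its parameters are hypothetical, chosen only for this illustration, and `lam` stands for the latency λ.

```python
from functools import reduce
from math import ceil
import operator


def naive_global_combine(values, k, lam, op=operator.add):
    """Direct-exchange baseline in a simulated multiport postal model.

    Assumption: a message sent in round r is received in round r + lam - 1,
    as described in the abstract. Returns (results, rounds), where results[i]
    is the combined value at processor i and rounds is the round in which
    the last message is received.
    """
    n = len(values)
    if n == 0 or k < 1 or lam < 1:
        raise ValueError("need n >= 1, k >= 1, lam >= 1")

    received = [[v] for v in values]  # each processor starts with its own datum
    in_flight = {}                    # arrival round -> list of (dest, payload)
    send_rounds = ceil((n - 1) / k)

    for r in range(1, send_rounds + 1):
        first = (r - 1) * k + 1
        offsets = range(first, min(first + k, n))  # at most k offsets, all < n
        for src in range(n):
            for off in offsets:
                dest = (src + off) % n             # never equals src
                in_flight.setdefault(r + lam - 1, []).append((dest, values[src]))

    last_arrival = 0
    for arrival in sorted(in_flight):
        per_dest = [0] * n
        for dest, payload in in_flight[arrival]:
            per_dest[dest] += 1
            received[dest].append(payload)
        # Port constraint: no processor receives more than k messages per round.
        assert max(per_dest) <= k
        last_arrival = arrival

    # op is assumed associative and commutative, so arrival order is irrelevant.
    results = [reduce(op, vals) for vals in received]
    return results, last_arrival


if __name__ == "__main__":
    results, rounds = naive_global_combine(list(range(1, 9)), k=2, lam=3)
    print(results, rounds)
```

This baseline finishes in ⌈(n − 1)/k⌉ + λ − 1 rounds and makes no attempt at the round-optimality that the abstract claims for the paper's algorithm; it only shows how the three parameters interact.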