Pipelined All-to-All Broadcast in All-Port Meshes and Tori

Authors:
Yuanyuan Yang; Jianchao Wang
Affiliations:
State Univ. of Stony Brook, New York, NY; DataTreasury Corp., Melville, NY
Venue:
IEEE Transactions on Computers
Year:
2001

Abstract

All-to-all communication is one of the densest communication patterns and occurs in many important parallel computing applications. In this paper, we present a new all-to-all broadcast algorithm for all-port meshes and tori. The algorithm uses controlled message flooding based on a novel broadcast pattern that balances the traffic load across all dimensions of the network, so that the optimal transmission time for all-to-all broadcast can be achieved. The broadcast pattern is described for each node in a formal, generic way in terms of a few simple operations and can be easily built into router hardware. Unlike existing all-to-all broadcast algorithms, the new algorithm overlaps message switching time with transmission time in a pipelined fashion to reduce the total communication delay. In most cases, the total communication delay is within a small constant of the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetrical for every message and every node, so it can be easily implemented in hardware and can achieve the optimum in practice.
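
The abstract does not state the lower bound it refers to. As a rough illustration only, the sketch below computes the standard counting bound on the number of transmission steps, under assumptions that are not taken from the paper: every node starts with one unit-size message, each link delivers at most one message per step in each direction, and messages are not combined. Since a node must receive the other N - 1 messages and can receive at most one per incident link per step, the least-connected node needs at least ceil((N - 1) / degree) steps. The function names are hypothetical, and the bound ignores start-up and switching time, so it is not the paper's total-delay bound.

```python
from math import prod


def min_node_degree(dims, torus):
    """Fewest links at any node of a mesh or torus with the given side lengths.

    Assumption: torus sides are at least 3 (so the two wraparound neighbors
    are distinct) and mesh sides are at least 2.
    """
    smallest_side = 3 if torus else 2
    if any(k < smallest_side for k in dims):
        raise ValueError("side length too small for this sketch")
    # Torus: every node has two neighbors per dimension.
    # Mesh: a corner node has only one neighbor per dimension.
    per_dimension = 2 if torus else 1
    return per_dimension * len(dims)


def transmission_step_lower_bound(dims, torus):
    """Counting bound: ceil((N - 1) / minimum node degree)."""
    n_nodes = prod(dims)
    degree = min_node_degree(dims, torus)
    # Integer ceiling division without floating point.
    return -(-(n_nodes - 1) // degree)


if __name__ == "__main__":
    print(transmission_step_lower_bound((4, 4), torus=True))     # 4
    print(transmission_step_lower_bound((4, 4), torus=False))    # 8
    print(transmission_step_lower_bound((8, 8, 8), torus=True))  # 86
```

In a 4 x 4 torus every node has four links and must receive 15 messages, giving at least 4 steps; in a 4 x 4 mesh a corner node has only two links, so the same argument gives 8. A schedule that keeps every link busy in every dimension, which is what the balanced traffic load in the abstract aims for, is what it takes to approach this bound.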