All-to-all communication is one of the densest communication patterns and occurs in many important parallel computing applications. In this paper, we present a new all-to-all broadcast algorithm for all-port meshes and tori. The algorithm uses controlled message flooding based on a novel broadcast pattern that balances the traffic load across all dimensions of the network, so that the optimal transmission time for all-to-all broadcast can be achieved. The broadcast pattern is described formally and generically for each node in terms of a few simple operations and can easily be built into router hardware. Unlike existing all-to-all broadcast algorithms, the new algorithm overlaps message switching time with transmission time in a pipelined fashion to reduce the total communication delay. In most cases, the total communication delay is within a small constant of the lower bound for all-to-all broadcast. Finally, the algorithm is conceptually simple and symmetric for every message and every node, so it can be implemented easily in hardware and can achieve the optimum in practice.
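
To make the notion of a transmission-time lower bound concrete, the following is a minimal sketch of the standard counting argument, not the paper's own cost model. It assumes unit-length messages that are never combined, and that a node can receive at most one message per incident link per step (the all-port assumption). Under those assumptions a node needs at least ceil((N - 1) / ports) steps to receive the N - 1 messages from all other nodes. The function names and the `dims` tuple of dimension sizes are hypothetical, introduced only for illustration.

```python
from math import ceil, prod


def torus_degree(dims):
    """Links per node in a torus: 2 per dimension of size >= 3, 1 if size is 2."""
    return sum(2 if k >= 3 else (1 if k == 2 else 0) for k in dims)


def mesh_min_degree(dims):
    """Links at a corner node of a mesh: 1 per dimension of size >= 2."""
    return sum(1 for k in dims if k >= 2)


def transmission_lower_bound(dims, torus=True):
    """Minimum number of receive steps for all-to-all broadcast.

    Assumption (not from the paper): one unit message per link per step,
    no message combining. The most constrained node must receive
    N - 1 messages through its incident links.
    """
    n_nodes = prod(dims)
    if n_nodes == 1:
        return 0
    ports = torus_degree(dims) if torus else mesh_min_degree(dims)
    return ceil((n_nodes - 1) / ports)


def diameter(dims, torus=True):
    """Network diameter: the farthest message needs at least this many hops."""
    return sum(k // 2 if torus else k - 1 for k in dims)


if __name__ == "__main__":
    for dims in [(8, 8), (4, 4, 4)]:
        print(dims,
              "torus bound:", transmission_lower_bound(dims, torus=True),
              "mesh bound:", transmission_lower_bound(dims, torus=False))
```

The sketch shows why balancing traffic across all dimensions matters: the bound is only reachable if every incident link of a node delivers a useful message in nearly every step. The diameter gives a separate bound on how long the most distant message takes to arrive, which is presumably where overlapping switching with transmission helps.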