Deadlock-Free Message Routing in Multiprocessor Interconnection Networks
IEEE Transactions on Computers
Optimum Broadcasting and Personalized Communication in Hypercubes
IEEE Transactions on Computers
Deadlock-free multicast wormhole routing in multicomputer networks
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks
IEEE Transactions on Parallel and Distributed Systems
Unicast-Based Multicast Communication in Wormhole-Routed Networks
IEEE Transactions on Parallel and Distributed Systems
Proceedings of the 24th annual international symposium on Computer architecture
Simulation of modern parallel systems: a CSIM-based approach
Proceedings of the 29th conference on Winter simulation
Resource Deadlocks and Performance of Wormhole Multicast Routing Algorithms
IEEE Transactions on Parallel and Distributed Systems
Multidestination Message Passing in Wormhole k-ary n-cube Networks with Base Routing Conformed Paths
IEEE Transactions on Parallel and Distributed Systems
Using CSIM to model complex systems
WSC '88 Proceedings of the 20th conference on Winter simulation
Building a high-performance collective communication library
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Optimal Multicast Communication in Wormhole-Routed Torus Networks
IEEE Transactions on Parallel and Distributed Systems
Optimal Multicast with Packetization and Network Interface Support
ICPP '97 Proceedings of the international Conference on Parallel Processing
ICPP '98 Proceedings of the 1998 International Conference on Parallel Processing
Multidestination Message Passing Mechanism Conforming to Base Wormhole Routing Scheme
PCRCW '94 Proceedings of the First International Workshop on Parallel Computer Routing and Communication
Multicast on Irregular Switch-based Networks with Wormhole Routing
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
MPI: A Message-Passing Interface
MPI: A Message-Passing Interface
Architectural Support for Efficient Multicasting in Irregular Networks
IEEE Transactions on Parallel and Distributed Systems
Efficient Multicast on Irregular Switch-Based Cut-Through Networks with Up-Down Routing
IEEE Transactions on Parallel and Distributed Systems
Fault-Tolerant Broadcasting in Wormhole-Routed Torus Networks
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
HCW '99 Proceedings of the Eighth Heterogeneous Computing Workshop
Multi-Node Multicast in Three and Higher Dimensional Wormhole Tori and Meshes with Load Balance
ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing
Pseudo-cycle-based multicast routing in wormhole-routed networks
Journal of Computer Science and Technology
Efficient Multiple Multicast on Heterogeneous Network of Workstations
The Journal of Supercomputing
Journal of Systems Architecture: the EUROMICRO Journal
This paper presents a new approach to minimizing node contention while performing multiple multicasts/broadcasts on wormhole $k$-ary $n$-cube networks with overlapping destination sets. Existing multicast algorithms in the literature perform poorly under multiple multicast because they were designed with only a single multicast in mind. The new algorithms introduced in this paper do not use any global knowledge about the respective destination sets of the concurrent multicasts. Instead, they use only local information and a source-specific partitioning approach. For systems supporting unicast message-passing, a new SPUmesh (Source-Partitioned Umesh) algorithm is proposed and shown to be superior to the conventional Umesh algorithm [2] for multiple multicast. For systems with multidestination message-passing, two algorithms, SQHL (Source-Quadrant Hierarchical Leader) and SCHL (Source-Centered Hierarchical Leader), are proposed and shown to be superior to the HL scheme [3]. All of these algorithms perform 1) 5-10 times faster than the existing algorithms under multiple multicast and 2) as fast as the existing algorithms under single multicast. Furthermore, the SCHL scheme demonstrates that the latency of multiple multicast can, in fact, be reduced as the degree of multicast increases beyond a certain number. These algorithms thus show significant potential for use in designing fast and scalable collective communication libraries on current and future-generation wormhole systems.