This paper proposes a new approach for implementing fast multicast and broadcast in unidirectional and bidirectional multistage interconnection networks (MINs) with multiport encoded multidestination worms. For a MIN with n stages, such worms use n header flits each. One flit is used for each stage of the network and it indicates the output ports to which a multicast message needs to be replicated. A multiport encoded worm with (d1, d2, ..., dn), 1 ≤ di ≤ k, degrees of replication for the respective stages is capable of covering (d1 × d2 × ... × dn) destinations with a single communication start-up. In this paper, a switch architecture is proposed for implementing multidestination worms without deadlock. Three grouping algorithms of varying complexity are presented to derive the associated multiport encoded worms for a multicast to an arbitrary set of destinations. Using these worms, a multinomial tree-based scheme is proposed to implement the multicast. This scheme significantly reduces broadcast/multicast latency compared to schemes using unicast messages. Simulation studies for both unidirectional and bidirectional MIN systems indicate that improvement in broadcast/multicast latency up to a factor of four is feasible using the new approach. Interestingly, this approach is able to implement multicast with reduced latency as the number of destinations increases beyond a certain number. Compared to implementing unicast messages, this approach requires little additional logic at the switches. Thus, the scheme demonstrates significant potential for implementing efficient collective communication operations on current and future MIN-based systems.
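
The coverage claim above can be illustrated with a minimal sketch. It is not the paper's switch design or any of its grouping algorithms. It assumes a unidirectional MIN built from k × k switches with destination-tag routing, where a destination address is the base-k number formed by the output port taken at each stage. The function names `covered_destinations` and `coverage` are hypothetical, and each header flit is modeled simply as a set of output ports.

```python
from itertools import product
from math import prod


def coverage(header_flits):
    """Number of destinations reached: d1 * d2 * ... * dn,
    where di is the number of output ports selected at stage i."""
    return prod(len(ports) for ports in header_flits)


def covered_destinations(header_flits, k):
    """List the destination addresses covered by one multiport encoded worm.

    header_flits[i] is the set of output ports (0..k-1) selected at stage i.
    Assumption (not from the source): the address of a destination is the
    base-k number whose i-th digit is the output port used at stage i.
    """
    for ports in header_flits:
        if not ports or any(p < 0 or p >= k for p in ports):
            raise ValueError("each stage needs 1..k valid output ports")

    destinations = []
    # Every combination of one selected port per stage is a distinct path.
    for digits in product(*(sorted(ports) for ports in header_flits)):
        address = 0
        for digit in digits:
            address = address * k + digit
        destinations.append(address)
    return destinations


if __name__ == "__main__":
    # 3-stage network of 2x2 switches: replicate at stages 1 and 3 only.
    flits = [{0, 1}, {1}, {0, 1}]
    print(coverage(flits))                 # degrees (2, 1, 2)
    print(covered_destinations(flits, 2))
```

With degrees of replication (2, 1, 2), this worm reaches 2 × 1 × 2 = 4 destinations from one start-up. An arbitrary destination set is generally not a full product of per-stage port sets, which is why a multicast may need several such worms.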