Architectural Support for Efficient Multicasting in Irregular Networks

Authors:
Rajeev Sivaram;Ram Kesavan;Dhabaleswar K. Panda;Craig B. Stunkel
Affiliations:
IBM Enterprise Systems Group, Poughkeepsie;Network Appliance, Sunnyvale, CA;Ohio State Univ., Columbus;IBM T.J. Watson Research Center, Yorktown Heights
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
2001

Abstract

Parallel computing on networks of workstations is fast becoming a cost-effective high-performance computing alternative to MPPs. Such a computing environment typically consists of processing nodes interconnected through a switch-based irregular network. Many of the problems that were solved for regular networks have to be solved anew for these systems. One such problem is that of efficient multicast communication. In this paper, we propose two broad categories of schemes for efficient multicasting in such irregular networks: network interface-based (NI-based) and switch-based. The NI-based multicasting schemes use the network interface of intermediate destinations for absorbing and retransmitting messages to other destinations in the multicast tree. In contrast, the switch-based multicasting schemes use hardware support for packet replication at the switches of the network and a concept known as multidestination routing to convey a multicast message from one source to multiple destinations. We first present alternative schemes for efficient multipacket forwarding at the NI and derive an optimal $k$-binomial multicast tree for multipacket NI-based multicast. We then propose two switch-based multicasting schemes that differ in the power of the encoding scheme and the complexity of the decoding logic at the switches. These multicasting schemes use path-based multidestination worms that can cover all nodes connected to switches along a valid unicast path and tree-based multidestination worms that can cover entire destination sets in a single phase using one worm, respectively. For each scheme, we describe the associated header encoding and decoding operation, the method for deriving multidestination worms that cover arbitrary multicast destination sets, and the multicasting scheme using the derived multidestination worms.
We then compare the NI-based multicasting scheme to the switch-based multicasting schemes with path-based and tree-based multidestination worms using simulation to determine the system parameters that affect each of the schemes and the range of system parameters for which each scheme performs best. Our results show that the switch-based multicasting scheme using a single tree-based multidestination worm performs the best among the three schemes. However, the NI-based multicasting scheme is capable of delivering high performance compared to the switch-based multicast using path-based worms, especially when the software overhead at the network interface is less than half of the overhead at the host. We therefore conclude that support for multicast at the NI is an important first step to improving multicast performance. However, there is still considerable gain that can be achieved by supporting hardware multicast in switches. Finally, while supporting such hardware multicast, it is better to support schemes that can achieve multicast in one phase.
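To make the idea of an NI-based multicast tree more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes that a "k-binomial" tree can be read as a binomial-style (recursive-doubling) tree in which no node forwards to more than k children; the abstract does not define the term, so this reading is an assumption. The function name `capped_binomial_tree` and the one-send-per-step timing model are hypothetical, and packetization and network contention are ignored.

```python
from collections import deque


def capped_binomial_tree(nodes, k):
    """Build a multicast tree by recursive doubling with a fan-out cap.

    nodes[0] is the source; the remaining entries are destinations.
    ASSUMPTION: in each step, every node that already holds the message
    may forward it to one new destination, unless it already has k
    children. Returns (children, steps), where children maps each node
    to the list of nodes it forwards to.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not nodes:
        raise ValueError("nodes must contain at least the source")

    children = {n: [] for n in nodes}
    holders = [nodes[0]]          # nodes that already hold the message
    pending = deque(nodes[1:])    # destinations not yet reached
    steps = 0

    while pending:
        steps += 1
        reached_this_step = []
        for sender in holders:
            if not pending:
                break
            if len(children[sender]) < k:
                receiver = pending.popleft()
                children[sender].append(receiver)
                reached_this_step.append(receiver)
        # New holders can only start forwarding in the next step.
        holders.extend(reached_this_step)

    return children, steps


if __name__ == "__main__":
    tree, steps = capped_binomial_tree(list(range(8)), k=2)
    print(steps, tree)
```

With a large enough k the sketch reduces to an ordinary binomial tree, and with k = 1 it degenerates into a chain. In this simplified model a smaller k always costs more steps; the trade-off that makes an intermediate k attractive for multipacket messages is not captured here.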