Recursive partitioning multicast: A bandwidth-efficient routing for Networks-on-Chip

Authors:
Lei Wang; Yuho Jin; Hyungjun Kim; Eun Jung Kim
Affiliations:
Department of Computer Science and Engineering, Texas A&M University, College Station, 77843 USA; Department of Computer Science and Engineering, Texas A&M University, College Station, 77843 USA; Department of Computer Science and Engineering, Texas A&M University, College Station, 77843 USA; Department of Computer Science and Engineering, Texas A&M University, College Station, 77843 USA
Venue:
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Year:
2009

Citing 17

Deadlock-Free Message Routing in Multiprocessor Interconnection Networks. IEEE Transactions on Computers
Reliable broadcast protocols. ACM Transactions on Computer Systems (TOCS)
Multicast Communication in Multicomputer Networks. IEEE Transactions on Parallel and Distributed Systems
Multi-address Encoding for Multicast. PCRCW '94 Proceedings of the First International Workshop on Parallel Computer Routing and Communication
Orion: a power-performance simulator for interconnection networks. Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures. HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
An Efficient Implementation of Tree-Based Multicast Routing for Distributed Shared-Memory Multiprocessors. SPDP '96 Proceedings of the 8th IEEE Symposium on Parallel and Distributed Processing (SPDP '96)
Token coherence: decoupling performance and correctness. Proceedings of the 30th annual international symposium on Computer architecture
A Delay Model and Speculative Architecture for Pipelined Routers. HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Principles and Practices of Interconnection Networks
Exploring Virtual Network Selection Algorithms in DSM Cache Coherence Protocols. IEEE Transactions on Parallel and Distributed Systems
Implementation and Evaluation of a Dynamically Routed Processor Operand Network. NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
On-Chip Interconnection Architecture of the Tile Processor. IEEE Micro
A 5-GHz Mesh Interconnect for a Teraflops Processor. IEEE Micro
Research Challenges for On-Chip Interconnection Networks. IEEE Micro
Virtual Circuit Tree Multicasting: A Case for On-Chip Hardware Multicast Support. ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Efficient unicast and multicast support for CMPs. Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture

Cited 8

Efficient lookahead routing and header compression for multicasting in networks-on-chip. Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
On an efficient NoC multicasting scheme in support of multiple applications running on irregular sub-networks. Microprocessors & Microsystems
Exploring partitioning methods for 3D Networks-on-Chip utilizing adaptive routing model. NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication. Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Planar adaptive network-on-chip supporting deadlock-free and efficient tree-based multicast routing method. Microprocessors & Microsystems
An efficient, low-cost routing framework for convex mesh partitions to support virtualization. ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
Efficient multicast schemes for 3-D Networks-on-Chip. Journal of Systems Architecture: the EUROMICRO Journal
Dual partitioning multicasting for high-performance on-chip networks. Journal of Parallel and Distributed Computing

Abstract

Chip Multi-processor (CMP) architectures have become the mainstream in processor design. As core counts grow, Networks-on-Chip (NOCs) provide a scalable communication fabric for CMP architectures. NOCs must be carefully designed to meet power and area constraints while providing ultra-low latency. Most existing NOCs use Dimension Order Routing (DOR) to determine the route a packet takes in unicast traffic. However, as the applications running on CMPs diversify, one-to-many (multicast) and one-to-all (broadcast) traffic are becoming more common, and current unicast routing cannot support such traffic efficiently. In this paper, we propose Recursive Partitioning Multicast (RPM) routing and a detailed multicast wormhole router design for NOCs. RPM allows routers to select intermediate replication nodes based on the global distribution of destination nodes. This provides more path diversity, which improves bandwidth efficiency and, in turn, the performance of the whole network. Simulation results from a detailed cycle-accurate simulator show that, compared with the most recent multicast scheme, RPM saves 25% of crossbar and link power and reduces link utilization by 33%, while improving network performance by 50%. RPM also scales to large networks better than the recently proposed VCTM.
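
To make the bandwidth argument in the abstract concrete, the following is a minimal sketch, not the RPM algorithm itself. It assumes a 2D mesh with XY dimension-order routing and compares the link traversals needed when a multicast is sent as separate unicast packets against a simple tree in which a packet crosses each shared link once and is replicated where paths diverge. The function names and the example coordinates are hypothetical; the abstract does not specify RPM's partitioning or replication rules, so they are not modeled here.

```python
def xy_route(src, dst):
    """Return the directed links on the XY dimension-order path from src to dst.

    Nodes are (x, y) tuples on a 2D mesh; the packet travels fully along X
    first, then along Y.
    """
    x, y = src
    links = []
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def unicast_link_traversals(src, dests):
    """Link traversals when each destination gets its own unicast packet."""
    return sum(len(xy_route(src, d)) for d in dests)


def tree_link_traversals(src, dests):
    """Link traversals when shared links carry the packet only once.

    The union of XY paths from a single source forms a tree, so counting
    distinct links models replication at the points where paths diverge.
    """
    links = set()
    for d in dests:
        links.update(xy_route(src, d))
    return len(links)


if __name__ == "__main__":
    source = (0, 0)
    destinations = [(3, 0), (3, 1), (3, 2)]  # illustrative values only
    print("separate unicasts:", unicast_link_traversals(source, destinations))
    print("shared tree:      ", tree_link_traversals(source, destinations))
```

For this example the three unicast packets cross 12 links in total, while the shared tree crosses 5. A DOR-based tree still fixes where replication happens; according to the abstract, RPM instead chooses intermediate replication nodes from the global distribution of destinations, which is where its additional path diversity comes from.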