Efficient unicast and multicast support for CMPs

  • Authors: Samuel Rodrigo; Jose Flich; Jose Duato; Mark Hummel

  • Affiliations: Parallel Architectures Group, Technical University of Valencia, Spain (Rodrigo, Flich, Duato); AMD, USA (Hummel)

  • Venue: Proceedings of the 41st Annual IEEE/ACM International Symposium on Microarchitecture
  • Year: 2008

Abstract

Beyond a certain number of cores, multi-core chips will require a network-on-chip (NoC) to interconnect the cores and overcome the limitations of a bus. NoCs must be carefully designed to meet constraints on power consumption, area, and ultra-low latency. Although 2D meshes with dimension-order routing (DOR) meet these constraints, the need for partitioning (e.g., virtual machines, coherency domains) and traffic isolation may prevent the use of DOR. In addition, core heterogeneity and manufacturing and run-time faults may lead to partially irregular topologies. Routing in these topologies is complex, and previously proposed solutions required routing tables, which drastically increase power consumption, area, and latency. The exception is LBDR (Logic-Based Distributed Routing), a flexible routing method for irregular topologies that removes the need for routing tables (both at end nodes and at switches), thus achieving large savings in chip area and power consumption. However, LBDR lacks support for multicast and broadcast, which are required to efficiently support cache coherence protocols for both single and multiple coherence domains.
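
As an illustration of why DOR is attractive in a regular 2D mesh and why it stops working once the topology becomes irregular, the following is a minimal sketch of XY dimension-order routing. It is not taken from the paper: the coordinate convention, the function names (`dor_next_hop`, `dor_path`, `path_uses_missing_link`), and the representation of a failed link as an unordered pair of nodes are all assumptions made for this example. LBDR itself is not modeled here.

```python
from typing import FrozenSet, List, Set, Tuple

Node = Tuple[int, int]  # (x, y) position of a switch in the mesh (assumed convention)


def dor_next_hop(current: Node, dest: Node) -> Node:
    """Return the next switch under XY dimension-order routing.

    The X offset is corrected first; Y is only touched once X matches.
    No table lookup is needed, only coordinate comparisons.
    """
    cx, cy = current
    dx, dy = dest
    if cx != dx:
        return (cx + (1 if dx > cx else -1), cy)
    if cy != dy:
        return (cx, cy + (1 if dy > cy else -1))
    return current  # already at the destination


def dor_path(src: Node, dest: Node) -> List[Node]:
    """List every switch visited from src to dest, inclusive."""
    path = [src]
    while path[-1] != dest:
        path.append(dor_next_hop(path[-1], dest))
    return path


def path_uses_missing_link(path: List[Node], missing: Set[FrozenSet[Node]]) -> bool:
    """Check whether a path crosses a link that is absent (hypothetical fault model)."""
    return any(frozenset((a, b)) in missing for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    route = dor_path((0, 0), (2, 1))
    print(route)
    # A single failed link on the X leg makes the fixed DOR path unusable.
    failed = {frozenset(((1, 0), (2, 0)))}
    print(path_uses_missing_link(route, failed))
```

Because the DOR path between two nodes is fixed, a single missing link or a partition boundary on that path leaves the pair unreachable, which is the situation that motivates table-free schemes for irregular topologies such as LBDR.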