Characterization and cost-efficient selection of NoC topologies for general purpose CMPs

Authors:
Marta Ortín;Alexandra Ferrerón;Jorge Albericio;Darío Suárez;María Villarroya-Gaudó;Cruz Izu;Víctor Viñals
Affiliations:
U. of Zaragoza, Spain;U. of Zaragoza, Spain;U. of Zaragoza, Spain;U. of Zaragoza, Spain;U. of Zaragoza, Spain;U. of Adelaide, Australia;U. of Zaragoza, Spain
Venue:
Proceedings of the 2013 Interconnection Network Architecture: On-Chip, Multi-Chip
Year:
2013

Abstract

The importance of the interconnection network grows as the number of cores integrated on a chip increases: communication among nodes becomes a bottleneck that affects both system performance and power consumption. This work targets general-purpose CMPs, where there is rising concern about finding low-power alternatives. We explore the implications of the interconnect choice on overall performance by comparing the behaviour of three topologies: ring, mesh, and torus. We also evaluate two additional ring configurations (one with increased bandwidth and another with reduced-pipeline routers) and concentrated versions of the topologies. Running full-system simulations allows us to carefully model the processors, memory hierarchy, and interconnection network, and to execute realistic parallel and multiprogrammed workloads. We determine that the network diameter is critical for system performance and that a concentrated mesh offers the best area-energy-delay tradeoff for both 16- and 64-core chips. Traffic is very light and highly unbalanced, which suggests the need for a heterogeneous network with more resources located in specific areas.
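
To give a sense of why network diameter separates the three topologies, the sketch below computes the textbook worst-case hop count for each one. It is illustrative only and is not taken from the paper: it assumes a bidirectional ring, square 2D meshes and tori, and a hypothetical concentration degree of 4 (four cores sharing one router). The function names are invented for this example.

```python
import math


def _side(routers: int) -> int:
    """Side length of a square 2D layout; rejects non-square router counts."""
    k = math.isqrt(routers)
    if k * k != routers:
        raise ValueError("mesh/torus sketch assumes a square number of routers")
    return k


def ring_diameter(routers: int) -> int:
    # Bidirectional ring (assumption): the farthest node is halfway around.
    return routers // 2


def mesh_diameter(routers: int) -> int:
    # k x k mesh: corner to opposite corner, (k - 1) hops per dimension.
    return 2 * (_side(routers) - 1)


def torus_diameter(routers: int) -> int:
    # k x k torus: wraparound links halve the distance in each dimension.
    return 2 * (_side(routers) // 2)


def diameters(cores: int, concentration: int = 1) -> dict:
    """Router-to-router diameter per topology.

    `concentration` is the number of cores attached to each router;
    the value 4 used below is an assumption, not the paper's setting.
    """
    if cores % concentration != 0:
        raise ValueError("cores must be a multiple of the concentration degree")
    routers = cores // concentration
    return {
        "ring": ring_diameter(routers),
        "mesh": mesh_diameter(routers),
        "torus": torus_diameter(routers),
    }


if __name__ == "__main__":
    for cores in (16, 64):
        for concentration in (1, 4):
            print(cores, concentration, diameters(cores, concentration))
```

Under these assumptions, a 64-core ring has a diameter of 32 hops against 14 for the mesh and 8 for the torus, and concentrating four cores per router brings the 64-core mesh down to 6 hops. The sketch counts hops only; it says nothing about area, energy, or router pipeline depth, which the paper evaluates through full-system simulation.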