A Performance Comparison of Hierarchical Ring- and Mesh-Connected Multiprocessor Networks

Authors:
Govindan Ravindran;Michael Stumm
Venue:
HPCA '97 Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture
Year:
1997

Citing 0
Cited 14

Design and implementation of the NUMAchine multiprocessor
DAC '98 Proceedings of the 35th annual Design Automation Conference

Optimal Clustering of Hierarchical Hyper-Ring Multicomputers
The Journal of Supercomputing

Hierarchical Ring Network Configuration and Performance Modeling
IEEE Transactions on Computers

Performance and Configuration of Hierarchical Ring Networks for Multiprocessors
ICPP '97 Proceedings of the International Conference on Parallel Processing

The NUMAchine Multiprocessor
ICPP '00 Proceedings of the 2000 International Conference on Parallel Processing

A Tree Based Router Search Engine Architecture with Single Port Memories
Proceedings of the 32nd annual international symposium on Computer Architecture

Torus Ring: improving performance of interconnection network by modifying hierarchical ring
Parallel Computing

A Hybrid Ring/Mesh Interconnect for Network-on-Chip Using Hierarchical Rings for Global Routing
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip

Exploiting access semantics and program behavior to reduce snoop power in chip multiprocessors
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems

Modeling and evaluation of ring-based interconnects for Network-on-Chip
Journal of Systems Architecture: the EUROMICRO Journal

Hierarchical circuit-switched NoC for multicore video processing
Microprocessors & Microsystems

Design and evaluation of low latency interconnection networks for real-time many-core embedded systems
Computers and Electrical Engineering

On-chip ring network designs for hard-real time systems
Proceedings of the 21st International conference on Real-Time Networks and Systems

TornadoNoC: A lightweight and scalable on-chip network architecture for the many-core era
ACM Transactions on Architecture and Code Optimization (TACO)

Abstract

This paper presents a simulation study comparing the performance of hierarchical ring- and mesh-connected, wormhole-routed, shared-memory multiprocessor networks. Hierarchical rings are an interesting alternative to meshes because i) they can be clocked at faster rates, ii) they can have wider data paths and hence shorter messages, iii) they allow processing nodes to be added and removed at arbitrary locations, iv) their topology naturally exploits the spatial locality of application memory access patterns, and v) their topology allows efficient implementation of broadcasts. Our study shows that for workloads with little locality, meshes scale better than ring networks because ring-based systems have limited bisection bandwidth. However, for workloads with some memory access locality, hierarchical rings outperform meshes by 20-40% for system sizes of up to 128 processors. Even with poor access locality, hierarchical rings outperform meshes at these system sizes if the mesh router buffers hold only one flit, and they outperform meshes in systems with fewer than 36 processors regardless of mesh router buffer size.
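
The bisection-bandwidth argument in the abstract can be illustrated with a small counting sketch. It is not taken from the paper: it assumes a square k x k mesh without wraparound links and a hierarchy whose top-level ring is the only connection between the two halves of the machine, and it counts links rather than bandwidth, so it ignores the wider data paths and faster clocks that the abstract credits to rings. The function names are hypothetical.

```python
import math


def mesh_bisection_links(n_procs: int) -> int:
    """Links cut when a square k x k mesh (no wraparound) is split in half.

    Assumption for this sketch: n_procs is a perfect square with even side k,
    so a straight cut between the two middle columns severs exactly k links.
    """
    k = math.isqrt(n_procs)
    if k * k != n_procs or k % 2 != 0:
        raise ValueError("sketch assumes a square mesh with even side length")
    return k


def hierarchical_ring_bisection_links(n_procs: int) -> int:
    """Links cut when the top-level ring is split into two halves.

    Assumption for this sketch: all traffic between the halves crosses the
    top-level ring, and cutting a ring in two always severs two links,
    independent of the number of processors.
    """
    if n_procs < 2:
        raise ValueError("need at least two processors")
    return 2


if __name__ == "__main__":
    # The mesh link count grows with sqrt(N); the ring count stays constant.
    for n in (16, 36, 64, 100, 144):
        mesh = mesh_bisection_links(n)
        ring = hierarchical_ring_bisection_links(n)
        print(f"N={n:4d}  mesh bisection links={mesh:3d}  ring bisection links={ring}")
```

Under these assumptions the mesh count grows as the square root of the processor count while the ring count stays fixed, which is consistent with the abstract's observation that meshes scale better when traffic has little locality. The sketch says nothing about the crossover points reported in the study, which depend on link width, clock rate, and buffer size.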