Torus Ring: improving performance of interconnection network by modifying hierarchical ring

Authors:
Jong Wook Kwak; Chu Shik Jhon
Affiliations:
Processor Architecture Lab., SOC R&D Center, System LSI Division, Semiconductor Business, Samsung Electronics, Gyeonggi-do 446-711, Republic of Korea; Department of Electrical Engineering and Computer Science, Seoul National University, Seoul 151-742, Republic of Korea
Venue:
Parallel Computing
Year:
2007

Abstract

In multiprocessor systems, the design of the interconnection network is critical to overall system performance. Unidirectional ring-based networks have been a popular choice for high-performance, large-scale shared-memory multiprocessor systems. In this paper, we propose the "Torus Ring", a modified version of the two-level hierarchical ring. The Torus Ring has the same complexity as the hierarchical ring; the only difference is the way it connects the local rings. Compared to the hierarchical ring, the Torus Ring exploits the memory access locality of application programs more efficiently. It has an advantage over the hierarchical ring when the destination of a packet is in an adjacent local ring, especially the backward adjacent local ring. If the destination of a network packet is assumed to be uniformly distributed across the processing nodes, the average number of hops in the Torus Ring is equal to that of the hierarchical ring. However, because application programs in real parallel programming environments exhibit memory access locality, the Torus Ring is expected to yield a larger performance gain in practice. In simulation, the latency of the interconnection network is reduced by up to 19% and the execution time by up to 10% at a moderate ring utilization ratio.
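
To make the hop-count argument concrete, the following is a minimal sketch of the baseline only: a two-level unidirectional hierarchical ring. It is not the authors' simulator, and it does not model the Torus Ring, whose wiring the abstract does not specify. The model is an assumption: each local ring holds its processing nodes plus one inter-ring interface, the interfaces form a unidirectional global ring, and one hop is one link traversal. The names `build_hierarchical_ring`, `hops_from`, and `average_hops` are hypothetical.

```python
from collections import deque


def build_hierarchical_ring(num_local_rings, nodes_per_ring):
    """Build a directed graph for an ASSUMED two-level hierarchical ring.

    Vertices are ("pe", ring, index) for processing nodes and ("iri", ring)
    for the single inter-ring interface of each local ring (an assumption).
    """
    graph = {}
    for r in range(num_local_rings):
        stations = [("pe", r, i) for i in range(nodes_per_ring)] + [("iri", r)]
        # Unidirectional local ring: each station links to its successor.
        for a, b in zip(stations, stations[1:] + stations[:1]):
            graph.setdefault(a, []).append(b)
        # Unidirectional global ring linking the interfaces.
        graph[("iri", r)].append(("iri", (r + 1) % num_local_rings))
    return graph


def hops_from(graph, src):
    """Breadth-first search: minimum link traversals from src to every vertex."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_hops(graph):
    """Mean hop count over all ordered pairs of distinct processing nodes,
    i.e. under uniformly distributed destinations."""
    pes = [v for v in graph if v[0] == "pe"]
    total = 0
    pairs = 0
    for s in pes:
        dist = hops_from(graph, s)
        for d in pes:
            if d != s:
                total += dist[d]
                pairs += 1
    return total / pairs


# Four local rings with two processing nodes each.
g = build_hierarchical_ring(4, 2)
d = hops_from(g, ("pe", 0, 0))
print("forward adjacent ring :", d[("pe", 1, 0)])
print("backward adjacent ring:", d[("pe", 3, 0)])
print("uniform-traffic average:", average_hops(g))
```

Under these assumptions, a packet bound for the backward adjacent local ring must travel almost all the way around the global ring, so it costs more hops than one bound for the forward adjacent ring. This is the case in which the abstract says the Torus Ring has its largest advantage.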