Performance Evaluation of Hierarchical Ring-Based Shared Memory Multiprocessors

Authors:
M. Holliday; M. Stumm
Venue:
IEEE Transactions on Computers
Year:
1994

Citing 21
Cited 9

Simulating computer systems: techniques and tools

Simulating computer systems: techniques and tools
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors

IEEE Transactions on Computers
Multicomputer networks: message-based parallel processing

Multicomputer networks: message-based parallel processing
Performance analysis of hierarchical cache-consistent multiprocessors

Performance Evaluation - Selected papers from the international seminar on performance of distributed and parallel systems
Hector: A Hierarchically Structured Shared-Memory Multiprocessor

Computer - Special issue on experimental research in computer architecture
Synchronization without contention

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Comparative evaluation of latency reducing and tolerating techniques

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Comparison of hardware and software cache coherence schemes

ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Ultracomputers: a teraflop before its time

Communications of the ACM
How DEC developed Alpha

IEEE Spectrum
A performance study of memory consistency models

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Hiding memory latency using dynamic scheduling in shared-memory multiprocessors

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Improved multithreading techniques for hiding communication latency in multiprocessors

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Performance of the SCI ring

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Reducing memory latency via non-blocking and prefetching caches

ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
An effective synchronization network for hot-spot accesses

ACM Transactions on Computer Systems (TOCS)
Cache consistency in hierarchical-ring-based multiprocessors

Proceedings of the 1992 ACM/IEEE conference on Supercomputing
The impact of synchronization and granularity on parallel systems

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The Scalable Coherent Interface and Related Standards Projects

IEEE Micro
Limits on Interconnection Network Performance

IEEE Transactions on Parallel and Distributed Systems
Performance Tradeoffs in Multithreaded Processors

IEEE Transactions on Parallel and Distributed Systems

Predicting application behavior in large scale shared-memory multiprocessors

Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Optimal Clustering of Hierarchical Hyper-Ring Multicomputers

The Journal of Supercomputing
Performance of the hyper-ring multicomputer

SAC '98 Proceedings of the 1998 ACM symposium on Applied Computing
Hierarchical Ring Network Configuration and Performance Modeling

IEEE Transactions on Computers
Performance Modeling of Hierarchical Crossbar-Based Multicomputer Systems

IEEE Transactions on Computers
Performance and Configuration of Hierarchical Ring Networks for Multiprocessors

ICPP '97 Proceedings of the international Conference on Parallel Processing
Bidirectional versus Unidirectional Networks: Cost/Performance Trade-Offs

MASCOTS '95 Proceedings of the 3rd International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems
Torus Ring: improving performance of interconnection network by modifying hierarchical ring

Parallel Computing
TornadoNoC: A lightweight and scalable on-chip network architecture for the many-core era

ACM Transactions on Architecture and Code Optimization (TACO)

Abstract

This paper investigates the performance of word-packet, slotted unidirectional ring-based hierarchical direct networks in the context of large-scale shared memory multiprocessors. Slotted unidirectional rings are attractive because their electrical characteristics and simple interfaces allow for fast cycle times and large bandwidths. For large-scale systems, it is necessary to use multiple rings for increased aggregate bandwidth. Hierarchies are attractive because the topology ensures unique paths between nodes, simple node interfaces, and simple inter-ring connections. To ensure that a realistic region of the design space is examined, the architecture of the network used in the Hector prototype is adopted as the initial design point. A simulator of that architecture has been developed and validated with measurements from the prototype. The system and workload parameterization reflects conditions expected in the near future. The results of this study show the importance of system balance to performance.