Hierarchical Ring Network Configuration and Performance Modeling
IEEE Transactions on Computers
Analytical queueing network models for expected message delay in 2-level and 3-level hierarchical-ring interconnection networks (INs) are developed. Such networks have recently been used in commercial and research prototype multiprocessors. A major class of traffic carried by these INs consists of cache line transfers, and associated coherency control messages, between processor caches and remote memory modules in shared-memory multiprocessors. Memory modules are assumed to be evenly distributed over the processor nodes. Such traffic consists of short, fixed-length messages, which can be conveniently transported using the slotted ring transmission technique studied here. The message delay results derived from the models are shown to be quite accurate when checked against a simulation study. The comparisons to simulations include heavy traffic situations, at ring utilization levels of 80 to 90%, where queueing delays in ring crossover switches are significant. As well as facilitating analysis, the analytical models can be used to determine optimal sizes for the rings at different levels in the hierarchy, under specified traffic distributions, in a system with a given total number of processor nodes. Optimality is defined as minimizing average message delay. A specific example of such a design exercise is provided for the uniform traffic case.
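To make the ring-sizing exercise concrete, the following is a minimal sketch, not the paper's queueing model. It assumes a 2-level hierarchy of unidirectional rings in which each of `k` local rings holds `m` processor nodes plus one inter-ring interface, the `k` interfaces form the global ring, and traffic is uniform. It ignores slot contention and queueing delay entirely and only counts zero-load hops, so it is a rough stand-in for average message delay. The names `mean_hops` and `best_config` and the station layout are assumptions made for illustration.

```python
from itertools import product


def mean_hops(m: int, k: int) -> float:
    """Zero-load mean hop count over all ordered pairs of distinct nodes.

    Assumed layout: each local ring has m processor nodes at positions 1..m
    and an inter-ring interface at position 0; the k interfaces sit on a
    unidirectional global ring. By symmetry the source is placed in ring 0.
    """
    if m * k < 2:
        raise ValueError("need at least two processor nodes")
    stations = m + 1  # processor nodes plus the interface
    total = 0
    pairs = 0
    for a, b in product(range(1, m + 1), repeat=2):
        # Destination on the same local ring (skip a node talking to itself).
        if a != b:
            total += (b - a) % stations
            pairs += 1
        # Destination on ring j: up to the interface, j hops around the
        # global ring, then b hops down to the destination node.
        for j in range(1, k):
            total += (-a) % stations + j + b
            pairs += 1
    return total / pairs


def best_config(n: int):
    """Return (m, k, mean hops) minimizing zero-load hops with m * k == n."""
    candidates = [(m, n // m) for m in range(1, n + 1) if n % m == 0]
    return min(
        ((m, k, mean_hops(m, k)) for m, k in candidates),
        key=lambda c: c[2],
    )


if __name__ == "__main__":
    m, k, hops = best_config(16)
    print(f"16 nodes: {k} local rings of {m} nodes, mean hops {hops:.3f}")
```

Under these assumptions the hop count trades local ring length against global ring length, which is the same tension the analytical models resolve once queueing in the crossover switches is taken into account.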