The Performance of the Cedar Multistage Switching Network

Authors:
Josep Torrellas;Zheng Zhang
Affiliations:
Univ. of Illinois at Urbana-Champaign, Urbana;Univ. of Illinois at Urbana-Champaign, Urbana
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1997

Abstract

While multistage switching networks for vector multiprocessors have been studied extensively, detailed evaluations of their performance are rare. Indeed, analytical models, simulations with pseudosynthetic loads, studies focused on average-value parameters, and measurements of networks disconnected from the machine, all provide limited information. In this paper, instead, we present an in-depth empirical analysis of a multistage switching network in a realistic setting: We use hardware probes to examine the performance of the omega network of the Cedar shared-memory machine executing real applications. The machine is configured with 16 vector processors.

The analysis suggests that the performance of multistage switching networks is limited by traffic nonuniformities. We identify two major nonuniformities that degrade Cedar's performance and are likely to slow down other networks too. The first one is the contention caused by the return messages in a vector access as they converge from the memories to one processor port. This traffic convergence penalizes vector reads and, more importantly, causes tree saturation. The second nonuniformity is the uneven contention delays induced by a relatively fair scheme to resolve message collisions. Based on our observations, we argue that intuitive optimizations for multistage switching networks may not be the most cost-effective ones. Instead, we suggest changes to increase the network bandwidth at the root of the traffic convergence tree and to delay traffic convergence up until the final stages of the network.
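
The traffic convergence described in the abstract can be illustrated with a small sketch. It is not the paper's method or Cedar's actual hardware: it assumes a generic textbook omega network built from 2x2 switches with destination-tag routing, 16 ports, and one message per source, and the function names `route` and `links_used_per_stage` are hypothetical. It counts how many distinct links carry traffic after each stage when every memory replies to a single processor port, compared with a pattern in which each source targets a different port.

```python
def route(src, dest, n_bits):
    """Links occupied after each stage of an omega network (assumed 2x2
    switches, destination-tag routing; not Cedar's actual switch design)."""
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        # Perfect shuffle (shift left), then the switch selects the output
        # given by the destination bit for this stage, most significant first.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        pos = ((pos << 1) & mask) | bit
        path.append(pos)
    return path


def links_used_per_stage(pairs, n_bits):
    """Number of distinct links carrying traffic after each stage."""
    used = [set() for _ in range(n_bits)]
    for src, dest in pairs:
        for stage, link in enumerate(route(src, dest, n_bits)):
            used[stage].add(link)
    return [len(links) for links in used]


N_BITS = 4                      # 16 ports, an assumption matching the 16 processors
N = 1 << N_BITS

# Return traffic of a vector read: every memory replies to port 5 (arbitrary choice).
converging = [(memory, 5) for memory in range(N)]
# Reference pattern: source i sends to destination i, so no two messages share a link.
spread = [(port, port) for port in range(N)]

print(links_used_per_stage(converging, N_BITS))  # [8, 4, 2, 1]
print(links_used_per_stage(spread, N_BITS))      # [16, 16, 16, 16]
```

Under these assumptions the converging pattern halves the number of usable links at every stage, forming a tree whose root is the single link into the destination port. That shape is consistent with the abstract's suggestion to add bandwidth at the root of the convergence tree and to delay convergence until the final stages.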