This paper presents the results of a series of experiments on the vector performance of the NEC SX-2. The main objective of the study is to understand the architecture and to identify its bottlenecks and limiting factors. A simple performance model is used to examine the impact of certain architectural features on the performance of a set of basic operations. The results of running this set on the machine for four vector lengths and three memory strides are presented and compared. These results show that the vector length and the ratio of floating-point operations to memory references have a great impact on the performance of the machine. Two numerical algorithms are also employed, and the results for these algorithms and for the basic operations are compared with early results on one processor of the Cray-2 and of the Cray Y-MP. These comparisons show that the SX-2 is faster than the Cray Y-MP by up to 86% for short vectors and by a factor of 2 to 4 for long vectors; it outperforms the Cray-2 by even larger factors. Finally, the architecture of the SX-X is presented, and some predictions about its performance are given.
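
The abstract does not spell out the performance model, so the following is only an illustrative sketch of how vector length and the ratio of floating-point operations to memory references can limit the achieved rate. It assumes Hockney's two-parameter model, r(n) = r_inf / (1 + n_half / n), combined with a simple memory-bandwidth cap; this is a stand-in, not necessarily the model used in the paper. The function names and every numeric value (`r_inf`, `n_half`, `mem_rate`, the vector lengths) are hypothetical and are not measurements of the SX-2.

```python
def hockney_rate(n, r_inf, n_half):
    """Rate for vector length n under Hockney's (r_inf, n_half) model.

    r_inf  : asymptotic rate for very long vectors (e.g. MFLOPS)
    n_half : vector length at which half of r_inf is achieved
    """
    if n <= 0:
        raise ValueError("vector length must be positive")
    return r_inf / (1.0 + n_half / n)


def bounded_rate(compute_rate, flops_per_ref, mem_rate):
    """Cap a compute rate by memory traffic.

    flops_per_ref : floating-point operations per memory reference
    mem_rate      : memory references per microsecond, so that
                    flops_per_ref * mem_rate is in MFLOPS
    """
    return min(compute_rate, flops_per_ref * mem_rate)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the trend.
    r_inf, n_half, mem_rate = 1000.0, 100.0, 800.0
    for n in (10, 100, 1000, 10000):       # four illustrative vector lengths
        for ratio in (0.5, 1.0, 2.0):      # flops per memory reference
            rate = bounded_rate(hockney_rate(n, r_inf, n_half), ratio, mem_rate)
            print(f"n={n:6d}  flops/ref={ratio:3.1f}  rate={rate:8.1f}")
```

Under these assumptions, short vectors are dominated by the startup term `n_half`, while long vectors with few operations per memory reference are capped by the memory term. A stride-dependent `mem_rate` could be substituted to mimic the effect of memory strides, but that extension is likewise an assumption rather than something the abstract describes.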