Benchmarking a vector-processor prototype based on multithreaded streaming/FIFO vector (MSFV) architecture

Authors:
Tetsuo Hironaka;Takashi Hashimoto;Keizo Okazaki;Kazuaki Murakami;Shinji Tomita
Venue:
ICS '92 Proceedings of the 6th international conference on Supercomputing
Year:
1992



Abstract

This paper presents benchmark results for a vector-processor prototype based on the MSFV (multithreaded streaming/FIFO vector) architecture. The MSFV architecture is single-chip oriented, so its main objective is to save off-chip memory bandwidth by exploiting register bandwidth instead. The register bandwidth is exploited through the synergy of FIFO registers, chaining, streaming, and multithreading. This paper tries to identify the strengths and weaknesses of those architectural features. Results for basic vector operations and the Livermore Fortran Kernels are reported in terms of normalized FLOPC (floating-point operations per clock cycle) and compared with previously reported results for the Cray X-MP and Y-MP, the Fujitsu VP-200, the Hitachi S-810/20, and the NEC SX-2 and SX-3. These comparisons show that, for many basic vector operations, the execution rate of the MSFV prototype is the lowest because it saves memory bandwidth. However, for the Livermore Fortran Kernels, the MSFV prototype outperforms the VP-200 by 2.11 times (geometric mean) and one processor of the X-MP by 1.22 times (geometric mean) in terms of FLOPC. It also attains 0.67 times (geometric mean) the FLOPC of the S-810/20 and 0.76 times (geometric mean) that of the SX-2. The paper concludes that the MSFV architecture is successful in saving memory bandwidth.
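To make the metric concrete, the following is a minimal sketch of how a FLOPC figure and a geometric-mean comparison across kernels could be computed. The abstract does not describe the paper's normalization procedure, so this only illustrates the general idea. The function names (`flopc`, `geometric_mean`, `relative_flopc`) and all input numbers are hypothetical placeholders, not measurements from the paper.

```python
import math


def flopc(flop_count, clock_cycles):
    """Floating-point operations per clock cycle for one kernel run."""
    if clock_cycles <= 0:
        raise ValueError("clock_cycles must be positive")
    return flop_count / clock_cycles


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_flopc(candidate, baseline):
    """Geometric mean of per-kernel FLOPC ratios (candidate / baseline)."""
    if len(candidate) != len(baseline):
        raise ValueError("both machines need one FLOPC value per kernel")
    return geometric_mean(c / b for c, b in zip(candidate, baseline))


# Hypothetical placeholder data: (floating-point operations, clock cycles)
# for two kernels on two machines. These are NOT figures from the paper.
candidate = [flopc(200, 100), flopc(300, 100)]  # 2.0 and 3.0 FLOPC
baseline = [flopc(100, 100), flopc(400, 100)]   # 1.0 and 4.0 FLOPC

ratio = relative_flopc(candidate, baseline)
print(f"relative FLOPC (geometric mean): {ratio:.3f}")
```

A ratio above 1 means the candidate machine sustains more floating-point operations per cycle than the baseline across the kernel set, and a ratio below 1 means fewer. The geometric mean is a common choice for such summaries because the geometric mean of the per-kernel ratios equals the ratio of the two machines' geometric means, so the result does not depend on which machine is treated as the reference.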