Architecture of the VPP500 parallel supercomputer

Authors:
Teruo Utsumi;Masayuki Ikeda;Moriyuki Takamura
Affiliations:
Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki 211, JAPAN;Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki 211, JAPAN;Fujitsu Limited, 1015 Kamikodanaka, Nakahara-ku, Kawasaki 211, JAPAN
Venue:
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Year:
1994



Abstract

The VPP500 vector parallel processor is a highly parallel, distributed memory supercomputer with a performance range of 6.4 to 355 gigaFLOPS and a main memory capacity of 1 to 55 gigabytes. The system scales from 4 to 222 processors interconnected by a high-bandwidth crossbar network.

Three key aspects of the VPP500, which are in sharp contrast to current massively parallel systems, characterize its architecture. First, the building block is a 1.6 gigaFLOPS vector processor that is more than an order of magnitude faster than the processors used in massively parallel processors (MPPs). This high uniprocessor performance reduces the dependence on parallelism. Second, the distributed memory architecture and high-bandwidth crossbar network eliminate many of the bottlenecks found in MPP systems. Together these allow efficient utilization of the hardware and reduce the complexity of programming parallel computers. Third, the system achieves high throughput through its ability to arbitrarily partition the processing elements for flexible multiprocessing.
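
The quoted performance range follows from the processor count and the per-processor peak. A minimal sketch of that arithmetic, assuming aggregate peak performance is simply the number of processors times 1.6 gigaFLOPS (the helper name `peak_gflops` and the constants are illustrative, not taken from the paper):

```python
# Assumption: aggregate peak = processor count x per-processor peak.
# The figures 1.6 gigaFLOPS, 4, and 222 come from the abstract; the
# function and constant names are hypothetical, for illustration only.
PEAK_GFLOPS_PER_PE = 1.6
MIN_PES, MAX_PES = 4, 222


def peak_gflops(num_pes: int) -> float:
    """Return the aggregate peak in gigaFLOPS for num_pes processors."""
    if not MIN_PES <= num_pes <= MAX_PES:
        raise ValueError(f"processor count must be in [{MIN_PES}, {MAX_PES}]")
    return num_pes * PEAK_GFLOPS_PER_PE


if __name__ == "__main__":
    # Print the peak for the smallest and largest configurations.
    for n in (MIN_PES, MAX_PES):
        print(f"{n:3d} processors: {peak_gflops(n):.1f} gigaFLOPS peak")
```

Under this assumption the two endpoints come out at 6.4 and about 355 gigaFLOPS, matching the range stated in the abstract.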