Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System

  • Authors:
  • Hiroaki Fujii;Yoshiko Yasuda;Hideya Akashi;Yasuhiro Inagami;Makoto Koga;Osamu Ishihara;Masamori Kashiyama;Hideo Wada;Tsutomu Sumimoto

  • Affiliations:
  • -;-;-;-;-;-;-;-;-

  • Venue:
  • IPPS '97 Proceedings of the 11th International Symposium on Parallel Processing
  • Year:
  • 1997

Quantified Score

Hi-index 0.00

Visualization

Abstract

RISC-based Massively Parallel Processors (MPPs) often show low efficiency in real-world applications because of cache miss penalty, insufficient throughput of the memory system, and poor inter-processor communication performance. Hitachi's SR2201, an MPP scalable up to 2048 processors and 600 GFLOPS peak performance, overcomes these problems by introducing three novel features. First, its processor, the 150 MHz HARP-1E, solves the cache miss penalty by "pseudo vector processing" (PVP). In PVP, data is loaded by prefetching to a special register bank, bypassing the cache. Second, a multi-bank memory architecture that operates like a pipeline eliminates the memory system bottleneck. Third, the inter-processor communication achieves high performance on the three-dimensional crossbar network, using a "remote DMA transfer" protocol and a hardware-based cache coherency. As the result of these improvements, the SR2201 achieved 220.4 GFLOPS with 1024 processors in the LINPACK benchmark, which is almost 72% of the peak performance.