Cache performance in vector supercomputers

  • Authors:
  • L. I. Kontothanassis; R. A. Sugumar; G. J. Faanes; J. E. Smith; M. L. Scott

  • Affiliations:
  • University of Rochester, Rochester, NY; Cray Research Inc., Chippewa Falls, WI; Cray Research Inc., Chippewa Falls, WI; University of Wisconsin-Madison, Madison, WI; University of Rochester, Rochester, NY

  • Venue:
  • Proceedings of the 1994 ACM/IEEE conference on Supercomputing
  • Year:
  • 1994

Abstract

Traditional supercomputers use a flat multi-bank SRAM memory organization to supply high bandwidth at low latency. Most other computers use a hierarchical organization with a small SRAM cache and slower, cheaper DRAM for main memory. Such systems rely heavily on data locality to achieve optimum performance. This paper evaluates cache-based memory systems for vector supercomputers. We develop a simulation model for a cache-based version of the Cray Research C90 and use the NAS parallel benchmarks to provide a large-scale workload. We show that while caches reduce memory traffic and improve the performance of plain DRAM memory, they still lag behind cacheless SRAM. We identify the performance bottlenecks in DRAM-based memory systems and quantify their contribution to program performance degradation. We find the data fetch strategy to be a significant parameter affecting performance, evaluate the performance of several fetch policies, and show that small fetch sizes improve performance by maximizing the use of available memory bandwidth.
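
The fetch-size effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's C90 simulator: it assumes a simple direct-mapped cache, and the names `words_fetched` and `strided` and the 1024-word capacity are hypothetical choices made only for illustration. It counts how many words are transferred from memory when a vector of 4096 elements is read with unit stride and with a stride of 16 words.

```python
def words_fetched(addresses, line_words, num_lines):
    """Count words transferred from memory by a toy direct-mapped cache.

    addresses  -- word addresses touched, in program order
    line_words -- fetch (line) size in words
    num_lines  -- number of lines in the cache
    """
    tags = [None] * num_lines          # line number currently held in each slot
    traffic = 0
    for addr in addresses:
        line = addr // line_words      # memory line containing this word
        index = line % num_lines       # direct-mapped placement
        if tags[index] != line:        # miss: fetch the whole line
            tags[index] = line
            traffic += line_words
    return traffic


def strided(n, stride):
    """Word addresses of an n-element vector accessed with the given stride."""
    return [i * stride for i in range(n)]


CACHE_WORDS = 1024                     # assumed capacity, illustration only

if __name__ == "__main__":
    for line_words in (1, 4, 16):
        num_lines = CACHE_WORDS // line_words
        unit = words_fetched(strided(4096, 1), line_words, num_lines)
        wide = words_fetched(strided(4096, 16), line_words, num_lines)
        print(f"fetch size {line_words:2d}: unit stride {unit:6d} words, "
              f"stride 16 {wide:6d} words")
```

In this model the unit-stride traffic stays at 4096 words for every fetch size, while the stride-16 traffic grows from 4096 words at a fetch size of 1 to 65536 words at a fetch size of 16, because each fetched line contributes only one useful word. That is one plausible reading of why small fetch sizes make better use of memory bandwidth for vector codes; the paper's own measurements should be consulted for the actual results.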