The eDRAM based L3-Cache of the BlueGene/L Supercomputer Processor Node

  • Authors:
  • Martin Ohmacht;Dirk Hoenicke;Ruud Haring;Alan Gara

  • Affiliations:
  • IBM T. J. Watson Research Center;IBM T. J. Watson Research Center;IBM T. J. Watson Research Center;IBM T. J. Watson Research Center

  • Venue:
  • SBAC-PAD '04 Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing
  • Year:
  • 2004

Abstract

BlueGene/L is a supercomputer consisting of 64K dual-processor system-on-a-chip compute nodes, each capable of delivering an arithmetic peak performance of 5.6 Gflops. To match memory speed to this high compute performance, the system implements an aggressive three-level on-chip cache hierarchy in each node. The hierarchy offers high bandwidth and integrated prefetching at cache levels 2 and 3 to reduce memory access time. The integrated L3 cache stores a total of 4 MB of data in multi-bank embedded DRAM. The 1024-bit-wide data port of the embedded DRAM provides 22.4 GB/s of bandwidth to serve the speculative prefetching demands of the two processor cores and the Gigabit Ethernet DMA engine.
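The figures quoted in the abstract can be related with some simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes "64K" means 65,536 nodes and that GB/s is decimal (10^9 bytes per second), and the function names are made up for this example. Under those assumptions the numbers work out to roughly 367 Tflops of aggregate peak, 4 bytes of L3 bandwidth per peak flop, and about 175 million full-width transfers per second on the 1024-bit port. The last value is only the rate implied by the quoted bandwidth and width; the abstract does not state a clock frequency.

```python
# Figures quoted in the abstract.
NODES = 64 * 1024              # "64K" read as 65,536 (assumption)
PEAK_GFLOPS_PER_NODE = 5.6
PORT_WIDTH_BITS = 1024
L3_BANDWIDTH_GB_S = 22.4       # GB read as 10**9 bytes (assumption)


def system_peak_tflops(nodes=NODES, gflops_per_node=PEAK_GFLOPS_PER_NODE):
    """Aggregate arithmetic peak of the full machine, in Tflops."""
    return nodes * gflops_per_node / 1000.0


def l3_bytes_per_flop(bandwidth_gb_s=L3_BANDWIDTH_GB_S,
                      gflops_per_node=PEAK_GFLOPS_PER_NODE):
    """L3 bandwidth available per peak floating-point operation."""
    return bandwidth_gb_s / gflops_per_node


def implied_transfers_per_second(bandwidth_gb_s=L3_BANDWIDTH_GB_S,
                                 width_bits=PORT_WIDTH_BITS):
    """Full-width port transfers per second implied by the quoted bandwidth."""
    bytes_per_transfer = width_bits // 8   # 1024 bits = 128 bytes
    return bandwidth_gb_s * 1e9 / bytes_per_transfer


if __name__ == "__main__":
    print(f"System peak:        {system_peak_tflops():.1f} Tflops")
    print(f"L3 bytes per flop:  {l3_bytes_per_flop():.2f}")
    print(f"Implied port rate:  {implied_transfers_per_second() / 1e6:.1f} M transfers/s")
```

The ratio of bytes per flop is a common rough measure of how well a memory level can keep the arithmetic units fed, which is the balance the abstract describes.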