Cost-effectively offering private buffers in SoCs and CMPs

  • Authors:
  • Zhen Fang; Li Zhao; Ravishankar R. Iyer; Carlos Flores Fajardo; German Fabila Garcia; Seung Eun Lee; Bin Li; Steve R. King; Xiaowei Jiang; Srihari Makineni

  • Affiliations:
  • Intel Labs, Hillsboro, OR, USA; Intel Labs, Hillsboro, OR, USA; Intel Labs, Hillsboro, OR, USA; Intel Labs, Guadalajara, Mexico; Intel, Guadalajara, Mexico; Seoul National University of Science and Technology, Seoul, South Korea; Intel Labs, Hillsboro, OR, USA; Intel Labs, Hillsboro, OR, USA; Intel Labs, Hillsboro, OR, USA; Intel Labs, Hillsboro, OR, USA

  • Venue:
  • Proceedings of the international conference on Supercomputing
  • Year:
  • 2011


Abstract

High-performance SoCs and CMPs integrate multiple cores and hardware accelerators such as network interface devices and speech recognition engines. Cores use SRAM organized as a cache, while accelerators use SRAM as special-purpose storage such as FIFOs, scratchpad memory, or other forms of private buffers. Dedicated private buffers provide benefits such as deterministic access, but they are highly area-inefficient because the total available storage sees low average utilization. We propose Buffer-integrated-Caching (BiC), which integrates private buffers and traditional caches into a single shared SRAM block. Much as shared caches improve SRAM utilization on CMPs, the BiC architecture generalizes this advantage to a heterogeneous mix of cores and accelerators in future SoCs and CMPs. We demonstrate the cost-effectiveness of BiC using SoC-based low-power servers and CMP-based servers with an on-chip NIC. We show that with a small amount of extra area added to the baseline cache, BiC removes the need for large, dedicated SRAMs with minimal performance impact.
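The utilization argument in the abstract can be illustrated with a toy model. The sketch below is purely hypothetical (the `SharedSRAM` class, way counts, and accelerator names are invented for illustration, not taken from the paper): it models a shared SRAM whose ways can be carved out as an accelerator's private buffer and returned to the cache when the accelerator is idle, which is the capacity-sharing behavior a dedicated SRAM cannot offer.

```python
# Toy model (illustrative only, not the paper's actual BiC mechanism):
# a set-associative SRAM whose ways can serve either as cache capacity
# or as a private buffer borrowed by an accelerator.

class SharedSRAM:
    def __init__(self, num_ways, way_kib):
        self.way_kib = way_kib
        # Every way starts out as ordinary cache capacity.
        self.owner = ["cache"] * num_ways

    def reserve_buffer(self, ways_needed, accel_id):
        """Carve private-buffer ways out of the cache for an accelerator."""
        free = [i for i, o in enumerate(self.owner) if o == "cache"]
        if len(free) < ways_needed:
            raise ValueError("not enough cache ways available")
        for i in free[:ways_needed]:
            self.owner[i] = accel_id
        return free[:ways_needed]

    def release_buffer(self, accel_id):
        """Return an idle accelerator's ways to the cache."""
        for i, o in enumerate(self.owner):
            if o == accel_id:
                self.owner[i] = "cache"

    def cache_kib(self):
        """Capacity currently usable as cache."""
        return sum(self.way_kib for o in self.owner if o == "cache")


sram = SharedSRAM(num_ways=16, way_kib=32)  # 512 KiB shared SRAM
sram.reserve_buffer(4, "nic_fifo")          # NIC borrows 128 KiB as a FIFO
print(sram.cache_kib())                     # cache shrinks to 384 KiB
sram.release_buffer("nic_fifo")             # idle NIC returns its capacity
print(sram.cache_kib())                     # full 512 KiB is cache again
```

With a dedicated SRAM, the 128 KiB FIFO would consume area even while the NIC is idle; in the shared model that capacity reverts to the cache, which is the area-efficiency benefit the abstract claims for BiC.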