Reducing off-chip memory traffic by selective cache management scheme in GPGPUs

  • Authors:
  • Hyojin Choi; Jaewoo Ahn; Wonyong Sung

  • Affiliations:
  • Seoul National University, Gwanak-ro, Gwanak-gu, Seoul, Korea (all authors)

  • Venue:
  • Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
  • Year:
  • 2012

Abstract

The performance of General-Purpose Graphics Processing Units (GPGPUs) is frequently limited by the off-chip memory bandwidth. To mitigate this bandwidth-wall problem, recent GPUs are equipped with on-chip L1 and L2 caches. However, there has been little work on better utilizing the on-chip shared cache in GPGPUs. In this paper, we propose two cache management schemes: write-buffering and read-bypassing. The write-buffering technique uses the shared cache for inter-block communication, thereby reducing DRAM accesses by up to the capacity of the cache. The read-bypassing scheme prevents the shared cache from being polluted by streamed data that are consumed only within a thread-block. The proposed schemes can be applied selectively to global memory instructions using newly defined cache operators. We evaluate the effects of the proposed schemes on a few GPGPU applications through simulation and show that off-chip memory accesses are successfully reduced by the proposed techniques. We also analyze the effectiveness of these methods as the throughput gap between cores and off-chip memory widens.
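
The cache operators proposed in the paper are newly defined and are not part of standard CUDA or PTX. As a rough analogue only, the minimal sketch below (kernel name, sizes, and setup are assumptions for illustration, not the authors' implementation) uses the existing PTX `.cs` (cache-streaming, evict-first) operator to mark a load of data that is consumed only within its thread-block, which approximates the read-bypassing idea of keeping one-time-use streamed data from polluting the shared cache.

```cuda
#include <cuda_runtime.h>

// Hypothetical sketch: the paper's cache operators are not standard CUDA/PTX.
// As an analogue, the existing PTX ".cs" (cache-streaming, evict-first) hint
// marks a load whose data is used only once within the thread-block, so it is
// less likely to displace reusable lines in the shared L2 cache.
__device__ __forceinline__ float load_streaming(const float* addr) {
    float v;
    asm volatile("ld.global.cs.f32 %0, [%1];" : "=f"(v) : "l"(addr));
    return v;
}

// Each element of "in" is read exactly once and never reused across blocks,
// so a streaming (read-bypass-like) hint is appropriate for it.
__global__ void scale(const float* __restrict__ in, float* out, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = a * load_streaming(in + i);
}

int main() {
    const int n = 1 << 20;
    float *in, *out;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));
    scale<<<(n + 255) / 256, 256>>>(in, out, 2.0f, n);
    cudaDeviceSynchronize();
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

On devices of compute capability 3.2 and later, the `__ldcs()` intrinsic provides the same streaming-load hint without inline PTX; the paper's write-buffering scheme, by contrast, relies on its own operators to keep inter-block data resident in the shared cache and has no direct equivalent among the standard cache operators.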