Balancing DRAM locality and parallelism in shared memory CMP systems

  • Authors: Min Kyu Jeong; Doe Hyun Yoon; Dam Sunwoo; Mike Sullivan; Ikhwan Lee; Mattan Erez

  • Affiliations: Dept. of Electrical and Computer Engineering, The University of Texas at Austin; Intelligent Infrastructure Lab, Hewlett-Packard Labs; ARM Inc; Dept. of Electrical and Computer Engineering, The University of Texas at Austin; Dept. of Electrical and Computer Engineering, The University of Texas at Austin; Dept. of Electrical and Computer Engineering, The University of Texas at Austin

  • Venue: HPCA '12 Proceedings of the 2012 IEEE 18th International Symposium on High-Performance Computer Architecture
  • Year: 2012

Abstract

Modern memory systems rely on spatial locality to provide high bandwidth while minimizing memory device power and cost. The trend of increasing the number of cores that share memory, however, decreases apparent spatial locality because access streams from independent threads are interleaved. Memory access scheduling recovers only a fraction of the original locality because of buffering limits. We investigate new techniques to reduce inter-thread access interference. We propose to partition the internal memory banks between cores to isolate their access streams and eliminate locality interference. We implement this by extending the physical frame allocation algorithm of the OS such that physical frames mapped to the same DRAM bank can be exclusively allocated to a single thread. We compensate for the reduced bank-level parallelism of each thread by employing memory sub-ranking to effectively increase the number of independent banks. This combined approach, unlike memory bank partitioning or sub-ranking alone, simultaneously increases overall performance and significantly reduces memory power consumption.
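
The bank-partitioning idea in the abstract can be illustrated with a minimal sketch of a bank-aware frame allocator. This is not the authors' implementation. It assumes a hypothetical address mapping in which the bank index is simply the frame number modulo the number of banks, and it gives each core an equal, disjoint set of banks. The names `BankPartitionedAllocator`, `bank_of`, `alloc`, and `release` are invented for illustration.

```python
from collections import deque


class BankPartitionedAllocator:
    """Toy frame allocator that gives each core its own set of DRAM banks."""

    def __init__(self, num_frames, num_banks, num_cores):
        if num_banks % num_cores != 0:
            raise ValueError("num_banks must be a multiple of num_cores")
        self.num_banks = num_banks
        # One free list per bank, filled with the frames that map to it.
        self.free = [deque() for _ in range(num_banks)]
        for frame in range(num_frames):
            self.free[self.bank_of(frame)].append(frame)
        # Assign each core a disjoint, contiguous group of banks.
        per_core = num_banks // num_cores
        self.banks_of_core = {
            core: list(range(core * per_core, (core + 1) * per_core))
            for core in range(num_cores)
        }
        # Round-robin cursor so a core spreads pages over all of its banks.
        self._next = {core: 0 for core in range(num_cores)}

    def bank_of(self, frame):
        # ASSUMPTION: hypothetical mapping; real controllers use other bit slices.
        return frame % self.num_banks

    def alloc(self, core):
        """Return a free frame from one of the banks owned by `core`."""
        banks = self.banks_of_core[core]
        for i in range(len(banks)):
            idx = (self._next[core] + i) % len(banks)
            bank = banks[idx]
            if self.free[bank]:
                self._next[core] = (idx + 1) % len(banks)
                return self.free[bank].popleft()
        raise MemoryError(f"no free frames in the banks owned by core {core}")

    def release(self, frame):
        """Return a frame to the free list of the bank it maps to."""
        self.free[self.bank_of(frame)].append(frame)
```

With 8 banks and 4 cores, each core in this sketch owns 2 banks, so its requests never disturb the row buffers of another core's banks, but it also has only 2 banks to overlap accesses across. That loss of per-thread bank-level parallelism is what the abstract proposes to offset with memory sub-ranking, which this sketch does not model.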