Prefetch-Aware DRAM Controllers

  • Authors:
  • Chang Joo Lee;Onur Mutlu;Veynu Narasiman;Yale N. Patt

  • Affiliations:
  • Department of Electrical and Computer Engineering, The University of Texas at Austin, USA;Microsoft Research and Carnegie Mellon University, USA;Department of Electrical and Computer Engineering, The University of Texas at Austin, USA;Department of Electrical and Computer Engineering, The University of Texas at Austin, USA

  • Venue:
  • Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Existing DRAM controllers employ rigid, non-adaptive scheduling and buffer management policies when servicing prefetch requests. Some controllers treat prefetch requests the same as demand requests, others always prioritize demand requests over prefetch requests. However, none of these rigid policies result in the best performance because they do not take into account the usefulness of prefetch requests. If prefetch requests are useless, treating prefetches and demands equally can lead to significant performance loss and extra bandwidth consumption. In contrast, if prefetch requests are useful, prioritizing demands over prefetches can hurt performance by reducing DRAM throughput and delaying the service of useful requests. This paper proposes a new low-cost memory controller, called Prefetch-Aware DRAM Controller (PADC), that aims to maximize the benefit of useful prefetches and minimize the harm caused by useless prefetches. To accomplish this, PADC estimates the usefulness of prefetch requests and dynamically adapts its scheduling and buffer management policies based on the estimates. The key idea is to 1) adaptively prioritize between demand and prefetch requests, and 2) drop useless prefetches to free up memory system resources, based on the accuracy of the prefetcher. Our evaluation shows that PADC significantly outperforms previous memory controllers with rigid prefetch handling policies on both single- and multi-core systems with a variety of prefetching algorithms. Across a wide range of multiprogrammed SPEC CPU 2000/2006 workloads, it improves system performance by 8.2%on a 4-core system and by 9.9%on an 8-core system while reducing DRAM bandwidth consumption by 10.7% and 9.4% respectively.