Understanding and mitigating refresh overheads in high-density DDR4 DRAM systems

Authors:
Janani Mukundan;Hillery Hunter;Kyu-hyoun Kim;Jeffrey Stuecheli;José F. Martínez
Affiliations:
Cornell University, Ithaca, NY;IBM Thomas J. Watson, Yorktown Heights, NY;IBM Thomas J. Watson, Yorktown Heights, NY;IBM Systems and Tech. Group, Austin, TX;Cornell University, Ithaca, NY
Venue:
Proceedings of the 40th Annual International Symposium on Computer Architecture
Year:
2013


Abstract

Recent DRAM specifications exhibit increasing refresh latencies. A refresh command blocks a full rank, decreasing available parallelism in the memory subsystem significantly, thus decreasing performance. Fine Granularity Refresh (FGR) is a feature recently announced as part of JEDEC's DDR4 DRAM specification that attempts to tackle this problem by creating a range of refresh options that provide a trade-off between refresh latency and frequency. In this paper, we first conduct an analysis of DDR4 DRAM's FGR feature, and show that there is no one-size-fits-all option across a variety of applications. We then present Adaptive Refresh (AR), a simple yet effective mechanism that dynamically chooses the best FGR mode for each application and phase within the application. When looking at the refresh problem more closely, we identify in high-density DRAM systems a phenomenon that we call command queue seizure, whereby the memory controller's command queue seizes up temporarily because it is full of commands to a rank that is being refreshed. To attack this problem, we propose two complementary mechanisms called Delayed Command Expansion (DCE) and Preemptive Command Drain (PCD). Our results show that AR does exploit DDR4's FGR effectively. However, once our proposed DCE and PCD mechanisms are added, DDR4's FGR becomes redundant in most cases, except in a few highly memory-sensitive applications, where the use of AR does provide some additional benefit. In all, our simulations show that the proposed mechanisms yield 8% (14%) mean speedup with respect to traditional refresh, at normal (extended) DRAM operating temperatures, for a set of diverse parallel applications.
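
The latency/frequency trade-off that FGR exposes can be illustrated with a small back-of-the-envelope calculation. The sketch below is not taken from the paper: the mode names, the `RefreshMode` class, the `summarize` helper, and the timing values are all assumptions chosen for illustration (roughly in the range quoted for a high-density DDR4 device), and the "blocked fraction" is a simplified metric that ignores queueing effects such as command queue seizure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefreshMode:
    """One hypothetical refresh granularity option."""
    name: str
    t_refi_ns: float  # average interval between refresh commands
    t_rfc_ns: float   # time the rank is blocked by one refresh command

    def blocked_fraction(self) -> float:
        # Share of time the rank is unavailable because it is refreshing.
        return self.t_rfc_ns / self.t_refi_ns


# Illustrative timing values only (assumptions, not results from the paper).
MODES = [
    RefreshMode("1x", t_refi_ns=7800.0, t_rfc_ns=350.0),
    RefreshMode("2x", t_refi_ns=3900.0, t_rfc_ns=260.0),
    RefreshMode("4x", t_refi_ns=1950.0, t_rfc_ns=160.0),
]


def summarize(modes):
    """Map each mode name to its blocked fraction, rounded to 4 places."""
    return {m.name: round(m.blocked_fraction(), 4) for m in modes}


if __name__ == "__main__":
    for mode in MODES:
        print(
            f"{mode.name}: each refresh blocks the rank for {mode.t_rfc_ns:.0f} ns, "
            f"rank blocked {mode.blocked_fraction():.2%} of the time"
        )
```

Under these assumed numbers, the finer-grained modes shorten each individual blocking period but increase the total share of time spent refreshing, because the per-command latency does not shrink in proportion to the interval. Which side of that trade-off matters more depends on the access pattern, which is consistent with the abstract's observation that no single FGR mode suits every application.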