Reducing DRAM row activations with eager read/write clustering

Authors:
Myeongjae Jeon;Conglong Li;Alan L. Cox;Scott Rixner
Affiliations:
Rice University, Houston, TX;Rice University, Houston, TX;Rice University, Houston, TX;Rice University, Houston, TX
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2013

Abstract

This article describes and evaluates a new approach to optimizing DRAM performance and energy consumption that is based on eagerly writing dirty cache lines to DRAM. Under this approach, many dirty cache lines are written to DRAM before they are evicted. In particular, dirty cache lines that have not been recently accessed are eagerly written to DRAM when the corresponding row has been activated by an ordinary, noneager access, such as a read. This approach enables clustering of reads and writes that target the same row, resulting in a significant reduction in row activations. Specifically, for a variety of applications, it reduces the number of DRAM row activations by an average of 42% and a maximum of 82%. Moreover, the results from a full-system simulator show compelling performance improvements and energy consumption reductions. Out of 23 applications, 6 have overall performance improvements between 10% and 20%, and 3 have improvements in excess of 20%. Furthermore, 12 consume between 10% and 20% less DRAM energy, and 7 have energy consumption reductions in excess of 20%.
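
The clustering idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' simulator or controller design. It assumes a single DRAM bank with an open-row policy, and the event names (`store`, `read`, `evict`), the function `count_activations`, and the `cold_age` threshold standing in for "not recently accessed" are all hypothetical. It only counts how many row activations a short trace needs with and without eager writes.

```python
from typing import Dict, List, Optional, Tuple

# (kind, line_id, row): all three fields are illustrative, not from the article.
Event = Tuple[str, int, int]


def count_activations(events: List[Event], eager: bool, cold_age: int = 2) -> int:
    """Count row activations for one bank that keeps the last-used row open."""
    open_row: Optional[int] = None
    activations = 0
    # Dirty lines held in the cache: line_id -> (DRAM row, time of last store).
    dirty: Dict[int, Tuple[int, int]] = {}

    def touch(row: int) -> None:
        nonlocal open_row, activations
        if row != open_row:  # row miss: the bank must activate a new row
            activations += 1
            open_row = row

    for now, (kind, line, row) in enumerate(events):
        if kind == "store":
            # The line becomes dirty in the cache; no DRAM traffic yet.
            dirty[line] = (row, now)
        elif kind == "read":
            # Ordinary (non-eager) access: a demand read that reaches DRAM.
            touch(row)
            if eager:
                # Write back cold dirty lines of the same row while it is open.
                # These are row hits, so they add no activations.
                cold = [l for l, (r, t) in dirty.items()
                        if r == row and now - t >= cold_age]
                for l in cold:
                    del dirty[l]  # the line is now clean
        elif kind == "evict":
            # Only a line that is still dirty needs a writeback at eviction.
            if line in dirty:
                victim_row, _ = dirty.pop(line)
                touch(victim_row)
    return activations


trace: List[Event] = [
    ("store", 1, 10),
    ("store", 2, 20),
    ("read", 100, 10),
    ("read", 101, 20),
    ("read", 102, 30),
    ("evict", 1, 10),
    ("evict", 2, 20),
]

baseline = count_activations(trace, eager=False)
clustered = count_activations(trace, eager=True)
print(baseline, clustered)  # prints "5 3"
```

In this trace the baseline reopens rows 10 and 20 at eviction time, while the eager variant has already written both lines during the reads to those rows, so the evictions cause no DRAM access. The numbers come from this toy trace only and say nothing about the reductions reported in the article.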