D-MRAM cache: enhancing energy efficiency with 3T-1MTJ DRAM/MRAM hybrid memory

Authors:
Hiroki Noguchi;Kumiko Nomura;Keiko Abe;Shinobu Fujita;Eishi Arima;Kyundong Kim;Takashi Nakada;Shinobu Miwa;Hiroshi Nakamura
Affiliations:
Toshiba Corporate R&D Center, Kawasaki, Japan;Toshiba Corporate R&D Center, Kawasaki, Japan;Toshiba Corporate R&D Center, Kawasaki, Japan;Toshiba Corporate R&D Center, Kawasaki, Japan;The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan;The University of Tokyo, Tokyo, Japan
Venue:
Proceedings of the Conference on Design, Automation and Test in Europe
Year:
2013

Abstract

This paper proposes a non-volatile cache architecture based on a novel DRAM/MRAM cell-level hybrid memory (D-MRAM) that enables effective power reduction for high-performance mobile SoCs without area overhead. The key to reducing active power is an intermittent refresh process in DRAM mode. D-MRAM reduces static power consumption compared with conventional SRAM because the D-MRAM cell has no static leakage paths and its cells need no supply voltage when used in MRAM mode. In addition, with advanced perpendicular magnetic tunnel junctions (p-MTJs), which decrease the write energy and latency without shortening the retention time, D-MRAM can reduce power by replacing traditional SRAM caches. In 65-nm CMOS technology, the access latencies of a 1 MB memory macro are 2.2 ns/1.5 ns for read/write in DRAM mode and 2.2 ns/4.5 ns in MRAM mode, while the SRAM access latency is 1.17 ns. Evaluation with the SPEC CPU2006 benchmarks shows that the energy per instruction (EPI) of the total cache memory is reduced by 71% on average, while the instructions per cycle (IPC) of the D-MRAM cache architecture degrades by only approximately 4% on average despite its latency overhead.
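
To put the quoted latencies in perspective, the following minimal Python sketch computes how much slower each D-MRAM access is than SRAM and what fraction of the baseline cache EPI remains after a 71% reduction. It uses only the figures stated in the abstract. The names `SRAM_NS`, `LATENCY_NS`, `latency_ratios`, `RATIOS`, and `EPI_REMAINING` are illustrative and not from the paper, and the sketch assumes that the single 1.17 ns SRAM figure applies to both reads and writes.

```python
# Latencies in nanoseconds, as quoted in the abstract (65-nm CMOS, 1 MB macro).
SRAM_NS = 1.17  # assumption: the same figure applies to SRAM reads and writes
LATENCY_NS = {
    "DRAM mode": {"read": 2.2, "write": 1.5},
    "MRAM mode": {"read": 2.2, "write": 4.5},
}


def latency_ratios(latency_ns, baseline_ns):
    """Return each latency divided by the baseline latency."""
    return {
        mode: {op: ns / baseline_ns for op, ns in ops.items()}
        for mode, ops in latency_ns.items()
    }


RATIOS = latency_ratios(LATENCY_NS, SRAM_NS)

# Fraction of the baseline cache EPI that remains after a 71% reduction.
EPI_REMAINING = 1.0 - 0.71

if __name__ == "__main__":
    for mode, ops in RATIOS.items():
        for op, ratio in ops.items():
            print(f"{mode} {op}: {ratio:.2f}x SRAM latency")
    print(f"Remaining cache EPI: {EPI_REMAINING:.0%} of baseline")
```

Under these assumptions, the MRAM-mode write is the slowest access relative to SRAM, which is consistent with the abstract describing a latency overhead alongside the small average IPC loss.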