Preventing PCM banks from seizing too much power

  • Authors:
  • Andrew Hay (University of Auckland, Auckland, NZ)
  • Karin Strauss (Microsoft Research, Microsoft, Inc., Redmond, WA, and University of Washington, Seattle, WA)
  • Timothy Sherwood (University of California, Santa Barbara, CA)
  • Gabriel H. Loh (Microsoft Research, Microsoft, Inc., Redmond, WA, and University of Auckland, Auckland, NZ)
  • Doug Burger (Microsoft Research, Microsoft, Inc., Redmond, WA, and University of Washington, Seattle, WA)

  • Venue:
  • Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
  • Year:
  • 2011

Abstract

Widespread adoption of Phase Change Memory (PCM) requires solutions to several problems recently addressed in the literature, including limited endurance, increased write latencies, and the system-level changes required to exploit non-volatility. One important difference between PCM and DRAM that has received less attention is the increased need for write power management. Writing to a PCM cell requires high current density over hundreds of nanoseconds, and hard limits on the number of simultaneous writes must be enforced to ensure correct operation, limiting write throughput and therefore overall performance. Because several wear reduction schemes write only those bits that need to be written, the power required to write a cache line back to memory under such a system is variable, which creates an opportunity to reduce write power. This paper proposes policies that monitor the bits that have actually changed over time, rather than simply which lines are dirty. These policies can more effectively allocate power across the system to improve write concurrency. This method for allocating power across the memory subsystem is built on the idea of "power tokens," a transferable, but time-specific, allocation of power. The results show that with a storage overhead of 4.3% in the last-level cache, a power-aware memory system can improve the performance of multiprogrammed workloads by up to 84%.
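As a rough illustration of the two ideas the abstract combines, the sketch below charges each writeback only for the bits it actually flips and draws that cost from a shared budget of power tokens. This is a minimal sketch under assumed details: the names (PowerTokenPool, issue_writeback), the one-token-per-bit cost model, and the pool size are hypothetical, not the paper's implementation.

```python
# Hypothetical sketch: differential-write cost accounting plus a
# shared power-token pool. Constants and APIs are illustrative only.

LINE_BITS = 512     # 64-byte cache line
POOL_TOKENS = 1024  # assumed total write-power budget, in tokens

class PowerTokenPool:
    """Time-specific budget of write power shared by all PCM banks."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.available = capacity

    def try_acquire(self, tokens):
        # Grant the write only if enough power headroom exists right now.
        if tokens <= self.available:
            self.available -= tokens
            return True
        return False

    def release(self, tokens):
        # Tokens return to the pool once the write pulses complete,
        # so an allocation is tied to a specific window of time.
        self.available = min(self.capacity, self.available + tokens)

def changed_bits(old_line, new_line):
    """Count the bits that actually flip; only these draw write power."""
    return bin(old_line ^ new_line).count("1")

def issue_writeback(pool, old_line, new_line, tokens_per_bit=1):
    """Attempt a writeback; returns its token cost, or None on a stall."""
    cost = changed_bits(old_line, new_line) * tokens_per_bit
    if pool.try_acquire(cost):
        # ... drive the PCM write pulses, then call pool.release(cost)
        return cost
    return None  # stall: not enough power budget in this window

# Example: a line whose new value differs in 4 bits costs 4 tokens,
# far less than budgeting for a full-line (512-bit) write.
pool = PowerTokenPool(POOL_TOKENS)
cost = issue_writeback(pool, old_line=0xFFFF, new_line=0xFF0F)
assert cost == 4
```

The point of the sketch is the contrast with dirty-line accounting: a scheme that budgets for worst-case full-line writes must reserve LINE_BITS tokens per writeback, while counting changed bits lets many partially modified lines write concurrently under the same power cap.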