Energy-efficient cache design using variable-strength error-correcting codes

  • Authors:
  • Alaa R. Alameldeen;Ilya Wagner;Zeshan Chishti;Wei Wu;Chris Wilkerson;Shih-Lien Lu

  • Affiliations:
  • Intel Corporation, Hillsboro, OR, USA;Intel Corporation, Hillsboro, OR, USA;Intel Corporation, Hillsboro, OR, USA;Intel Corporation, Hillsboro, OR, USA;Intel Corporation, Hillsboro, OR, USA;Intel Corporation, Hillsboro, OR, USA

  • Venue:
  • Proceedings of the 38th annual international symposium on Computer architecture
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Visualization

Abstract

Voltage scaling is one of the most effective mechanisms to improve microprocessors' energy efficiency. However, processors cannot operate reliably below a minimum voltage, Vccmin, since hardware structures may fail. Cell failures in large memory arrays (e.g., caches) typically determine Vccmin for the whole processor. We observe that most cache lines exhibit zero or one failures at low voltages. However, a few lines, especially in large caches, exhibit multi-bit failures and increase Vccmin. Previous solutions either significantly reduce cache capacity to enable uniform error correction across all lines, or significantly increase latency and bandwidth overheads when amortizing the cost of error-correcting codes (ECC) over large lines. In this paper, we propose a novel cache architecture that uses variable-strength error-correcting codes (VS-ECC). In the common case, lines with zero or one failures use a simple and fast ECC. A small number of lines with multi-bit failures use a strong multi-bit ECC that requires some additional area and latency. We present a novel dynamic cache characterization mechanism to determine which lines will exhibit multi-bit failures. In particular, we use multi-bit correction to protect a fraction of the cache after switching to low voltage, while dynamically testing the remaining lines for multi-bit failures. Compared to prior multi-bit-correcting proposals, VS-ECC significantly reduces power and energy, avoids significant reductions in cache capacity, incurs little area overhead, and avoids large increases in latency and bandwidth.