Error-control coding for computer systems
Trading off Cache Capacity for Reliability to Enable Low Voltage Operation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
IBM Journal of Research and Development
Parichute: Generalized Turbocode-Based Error Correction for Near-Threshold Caches
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Soft error benchmarking of L2 caches with PARMA
Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Computing infrastructure for big data processing
Frontiers of Computer Science: Selected Publications from Chinese Universities
The increasing power consumption of processors has made power reduction a first-order priority in processor design. Voltage scaling is one of the most powerful power-reduction techniques introduced to date, but it is limited to some minimum voltage, VDDMIN. Below VDDMIN, on-chip SRAM cells cannot all operate reliably because process variability increases with technology scaling. Larger SRAM cells, which are less sensitive to process variability, allow a reduction in VDDMIN. However, large-scale memory structures such as last-level caches (LLCs) often determine the VDDMIN of a processor, and these structures cannot afford large SRAM cells because of the resulting increase in die area. In this paper we first propose a joint optimization of LLC cell size, the number of redundant cells, and the strength of error-correction coding (ECC) to minimize total SRAM area while meeting yield and VDDMIN targets. The joint use of redundant cells and ECC enables smaller cell sizes while maintaining design targets, and the area saved by the smaller cells more than makes up for the extra cells required by redundancy and ECC. In 32-nm technology our joint approach yields a 27% reduction in total SRAM area (including the extra cells) when targeting 90% yield and 600 mV VDDMIN. Second, we demonstrate that the ECC used to repair defective cells can be combined with a simple architectural technique so that it also fixes particle-induced soft errors, without increasing ECC strength or processor runtime.
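The joint optimization described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes independent bit failures, fault-free spare lines, a BCH-style cost of roughly 10 check bits per corrected bit plus one detection bit for a 512-bit line, and entirely hypothetical pairs of relative cell area and bit-failure probability at the target voltage. The function names (`binom_cdf`, `array_yield`, `optimize`) and all numeric inputs are illustrative assumptions.

```python
from math import exp, lgamma, log, log1p

def binom_cdf(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p), summed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k_max >= n else 0.0
    total = 0.0
    for k in range(min(k_max, n) + 1):
        log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                    + k * log(p) + (n - k) * log1p(-p))
        total += exp(log_term)
    return min(total, 1.0)

def array_yield(p_bit, n_lines, data_bits, check_bits, t, spare_lines):
    """Probability that the array is repairable.

    A line is usable if it has at most t faulty bits (ECC corrects them);
    the array is usable if the number of unusable lines does not exceed
    the number of spare lines (spares are assumed fault-free).
    """
    line_bits = data_bits + check_bits
    p_line_bad = 1.0 - binom_cdf(t, line_bits, p_bit)
    return binom_cdf(spare_lines, n_lines, p_line_bad)

def optimize(target_yield=0.90, n_lines=32768, data_bits=512):
    # Hypothetical (relative cell area, bit-failure probability) pairs
    # at the target VDDMIN; real values would come from circuit simulation.
    cells = [(1.00, 1e-9), (0.85, 1e-7), (0.70, 1e-5), (0.60, 1e-3)]
    best = None
    for cell_area, p_bit in cells:
        for t in range(0, 4):
            # Assumed check-bit cost: 10 bits per corrected error + 1 detection bit.
            check_bits = 0 if t == 0 else 10 * t + 1
            for spares in (0, 16, 64, 256):
                y = array_yield(p_bit, n_lines, data_bits, check_bits, t, spares)
                if y < target_yield:
                    continue
                # Total area counts data, check bits, and spare lines.
                area = cell_area * (n_lines + spares) * (data_bits + check_bits)
                if best is None or area < best["area"]:
                    best = {"area": area, "cell_area": cell_area, "t": t,
                            "spares": spares, "yield": y}
    return best

if __name__ == "__main__":
    print(optimize())
```

The search is exhaustive because the design space here is tiny. The point of the sketch is only that a smaller, less reliable cell can win on total area once ECC and spare lines absorb its failures, which is the trade-off the abstract describes.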