Design and Test of Large Embedded Memories: An Overview
IEEE Design & Test
Power efficiency of voltage scaling in multiple clock, multiple voltage cores
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided design
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Dynamic frequency and voltage control for a multiple clock domain microarchitecture
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
Power Efficient Data Cache Designs
ICCD '03 Proceedings of the 21st International Conference on Computer Design
Single-vDD and single-vT super-drowsy techniques for low-leakage high-performance instruction caches
Proceedings of the 2004 international symposium on Low power electronics and design
A Low Leakage and SNM Free SRAM Cell Design in Deep Sub Micron CMOS Technology
VLSID '06 Proceedings of the 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design
A 160 mV, fully differential, robust schmitt trigger based sub-threshold SRAM
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
DSD '07 Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools
Trading off Cache Capacity for Reliability to Enable Low Voltage Operation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Error-correcting codes for semiconductor memory applications: a state-of-the-art review
IBM Journal of Research and Development
Necromancer: enhancing system throughput by animating dead cores
Proceedings of the 37th annual international symposium on Computer architecture
RVC: a mechanism for time-analyzable real-time processors with faulty caches
Proceedings of the 6th International Conference on High Performance and Embedded Architectures and Compilers
Energy-efficient cache design using variable-strength error-correcting codes
Proceedings of the 38th annual international symposium on Computer architecture
Matching cache access behavior and bit error pattern for high performance low Vcc L1 cache
Proceedings of the 48th Design Automation Conference
Selective word reading for high performance and low power processor
Proceedings of the 2011 ACM Symposium on Research in Applied Computation
Point and discard: a hard-error-tolerant architecture for non-volatile last level caches
Proceedings of the 49th Annual Design Automation Conference
The Performance Vulnerability of Architectural and Non-architectural Arrays to Permanent Faults
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores
MICROW '12 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture Workshops
Breaking the energy barrier in fault-tolerant caches for multicore systems
Proceedings of the Conference on Design, Automation and Test in Europe
Efficient cache architectures for reliable hybrid voltage operation using EDC codes
Proceedings of the Conference on Design, Automation and Test in Europe
APPLE: adaptive performance-predictable low-energy caches for reliable hybrid voltage operation
Proceedings of the 50th Annual Design Automation Conference
Exploiting process variability in voltage/frequency control
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Hi-index | 0.00 |
Transistors per area unit double in every new technology node. However, the electric field density and power demand grow if Vcc is not scaled. Therefore, Vcc must be scaled in pace with new technology nodes to prevent excessive degradation and keep power demand within reasonable limits. Unfortunately, low Vcc operation exacerbates the effect of variations and decreases noise and stability margins, increasing the likelihood of errors in SRAM memories such as caches. Those errors translate into performance loss and performance variation across different cores, which is especially undesirable in a multi-core processor. This paper presents (i) a novel scheme to tolerate high faulty bit rates in caches by disabling only faulty subblocks, (ii) a dynamic address remapping scheme to reduce performance variation across different cores, which is key for performance predictability, and (iii) a comparison with state-of-the-art techniques for faulty bit tolerance in caches. Results for some typical first level data cache configurations show 15% average performance increase and standard deviation reduction from 3.13% down to 0.55% when compared to cache line disabling schemes.