Cache Memory Organization to Enhance the Yield of High Performance VLSI Processors
IEEE Transactions on Computers
ACM Transactions on Computer Systems (TOCS)
Evaluating Associativity in CPU Caches
IEEE Transactions on Computers
A Unified Negative-Binomial Distribution for Yield Analysis of Defect-Tolerant Circuits
IEEE Transactions on Computers
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Approximating block accesses in database organizations
Communications of the ACM
Design Challenges of Technology Scaling
IEEE Micro
Performance Implications of Tolerating Cache Faults
IEEE Transactions on Computers
Parameter variations and impact on circuits and microarchitecture
Proceedings of the 40th annual Design Automation Conference
Architecture of a VLSI instruction cache for a RISC
ISCA '83 Proceedings of the 10th annual international symposium on Computer architecture
PADded Cache: A New Fault-Tolerance Technique for Cache Memories
VTS '99 Proceedings of the 1999 17TH IEEE VLSI Test Symposium
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
A cache-defect-aware code placement algorithm for improving the performance of processors
ICCAD '05 Proceedings of the 2005 IEEE/ACM International conference on Computer-aided design
Statistical analysis of SRAM cell stability
Proceedings of the 43rd annual Design Automation Conference
SPEC CPU2006 benchmark descriptions
ACM SIGARCH Computer Architecture News
Performance of Graceful Degradation for Cache Faults
ISVLSI '07 Proceedings of the IEEE Computer Society Annual Symposium on VLSI
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
DSD '07 Proceedings of the 10th Euromicro Conference on Digital System Design Architectures, Methods and Tools
L2 Cache Modeling for Scientific Applications on Chip Multi-Processors
ICPP '07 Proceedings of the 2007 International Conference on Parallel Processing
IBM Journal of Research and Development
Trading off Cache Capacity for Reliability to Enable Low Voltage Operation
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
The PARSEC benchmark suite: characterization and architectural implications
Proceedings of the 17th international conference on Parallel architectures and compilation techniques
Tolerating process variations in large, set-associative caches: The buddy cache
ACM Transactions on Architecture and Code Optimization (TACO)
Circuit techniques for dynamic variation tolerance
Proceedings of the 46th Annual Design Automation Conference
CMOS design near the limit of scaling
IBM Journal of Research and Development
Power-constrained CMOS scaling limits
IBM Journal of Research and Development
ZerehCache: armoring cache architectures in high defect density technologies
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
A resilience roadmap: (invited paper)
Proceedings of the Conference on Design, Automation and Test in Europe
DEFCAM: A design and evaluation framework for defect-tolerant cache memories
ACM Transactions on Architecture and Code Optimization (TACO)
An analytical model for the calculation of the Expected Miss Ratio in faulty caches
IOLTS '11 Proceedings of the 2011 IEEE 17th International On-Line Testing Symposium
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Hi-index | 0.00 |
The traditional performance cost benefits we have enjoyed for decades from technology scaling are challenged by several critical constraints including reliability. Increases in static and dynamic variations are leading to higher probability of parametric and wear-out failures and are elevating reliability into a prime design constraint. In particular, SRAM cells used to build caches that dominate the processor area are usually minimum sized and more prone to failure. It is therefore of paramount importance to develop effective methodologies that facilitate the exploration of reliability techniques for caches. To this end, we present an analytical model that can determine for a given cache configuration, address trace, and random probability of permanent cell failure the exact expected miss rate and its standard deviation when blocks with faulty bits are disabled. What distinguishes our model is that it is fully analytical, it avoids the use of fault maps, and yet, it is both exact and simpler than previous approaches. The analytical model is used to produce the miss-rate trends (expected miss-rate) for future technology nodes for both uncorrelated and clustered faults. Some of the key findings based on the proposed model are (i) block disabling has a negligible impact on the expected miss-rate unless probability of failure is equal or greater than 2.6e-4, (ii) the fault map methodology can accurately calculate the expected miss-rate as long as 1,000 to 10,000 fault maps are used, and (iii) the expected miss-rate for execution of parallel applications increases with the number of threads and is more pronounced for a given probability of failure as compared to sequential execution.