On the effectiveness of residue code checking for parallel two's complement multipliers
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
DIVA: a reliable substrate for deep submicron microarchitecture design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient-fault recovery using simultaneous multithreading
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Detailed design and evaluation of redundant multithreading alternatives
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Dual use of superscalar datapath for transient-fault detection and recovery
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Concurrent Error Detection Using Watchdog Processors-A Survey
IEEE Transactions on Computers
Reliable Floating-Point Arithmetic Algorithms for Error-Coded Operands
IEEE Transactions on Computers
A 1.3GHz fifth generation SPARC64 microprocessor
Proceedings of the 40th annual Design Automation Conference
AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
WHICH CONCURRENT ERROR DETECTION SCHEME TO CHOOSE?
ITC '00 Proceedings of the 2000 IEEE International Test Conference
Transient-fault recovery for chip multiprocessors
Proceedings of the 30th annual international symposium on Computer architecture
Rambist builder: a methodology for automatic built-in self-test design of embedded rams
MTDT '96 Proceedings of the 1996 IEEE International Workshop on Memory Technology, Design and Testing (MTDT '96)
Failure Factor Based Yield Enhancement for SRAM Designs
DFT '04 Proceedings of the Defect and Fault Tolerance in VLSI Systems, 19th IEEE International Symposium
Efficient Resource Sharing in Concurrent Error Detecting Superscalar Microarchitectures
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
SWIFT: Software Implemented Fault Tolerance
Proceedings of the international symposium on Code generation and optimization
Design and Evaluation of Hybrid Fault-Detection Systems
Proceedings of the 32nd annual international symposium on Computer Architecture
Opportunistic Transient-Fault Detection
Proceedings of the 32nd annual international symposium on Computer Architecture
Microarchitecture-Based Introspection: A Technique for Transient-Fault Tolerance in Microprocessors
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
An Optimized DFT and Test Pattern Generation Strategy for an Intel High Performance Microprocessor
ITC '04 Proceedings of the International Test Conference on International Test Conference
Trends in manufacturing test methods and their implications
ITC '04 Proceedings of the International Test Conference on International Test Conference
Ultra low-cost defect protection for microprocessor pipelines
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Error Detection Using Dynamic Dataflow Verification
PACT '07 Proceedings of the 16th International Conference on Parallel Architecture and Compilation Techniques
Arithmetic Error Codes: Cost and Effectiveness Studies for Application in Digital System Design
IEEE Transactions on Computers
Argus: Low-Cost, Comprehensive Error Detection in Simple Cores
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Exploiting microarchitecture insights for efficient fault tolerance
Exploiting microarchitecture insights for efficient fault tolerance
A Low Cost Scheme for Reducing Silent Data Corruption in Large Arithmetic Circuits
DFT '08 Proceedings of the 2008 IEEE International Symposium on Defect and Fault Tolerance of VLSI Systems
On-Line Failure Detection and Confinement in Caches
IOLTS '08 Proceedings of the 2008 14th IEEE International On-Line Testing Symposium
IBM S/390 parallel enterprise server G5 fault tolerance: a historical perspective
IBM Journal of Research and Development
RAS strategy for IBM S/390 G5 and G6
IBM Journal of Research and Development
A survey of checker architectures
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
While Moore's Law predicts the ability of semi-conductor industry to engineer smaller and more efficient transistors and circuits, there are serious issues not contemplated in that law. One concern is the verification effort of modern computing systems, which has grown to dominate the cost of system design. On the other hand, technology scaling leads to burn-in phase out. As a result, in-the-field error rate may increase due to both actual errors and latent defects. Whereas data can be protected with arithmetic codes (like parity or ECC), there is a lack of cost-effective mechanisms for control logic. This paper presents a light-weight microarchitectural mechanism that ensures that data consumed through registers are correct. Microarchitecture presents a new way to manage reliability and testing without significantly sacrificing cost and performance, offering a unique opportunity to detect errors in the field at low cost. Our results show a coverage around 90% for the targeted structures with a cost in power and area of about 4%. The structures protected include the issue queue logic and the data associated (i.e., tags, control signals), input multiplexors, rename data, replay logic, register free list, bypasses data and logic, MOB data and addresses, register file logic, register file storage and functional units.