As microprocessors continue to evolve and grow in functionality, technology scaling to smaller nanometer feature sizes, coupled with high clock frequencies and exponentially increasing transistor counts, dramatically increases their susceptibility to transient faults. However, correct and reliable operation of these processors is often mandatory, both for consumer experience and for high-risk embedded domains such as medical and transportation systems. Economical fault detection and recovery is therefore essential to meet market requirements. This paper explores a novel way of leveraging superscalar, out-of-order architectures to enable multi-cycle transient fault tolerance throughout the datapath. Using dynamic instruction execution redundancy, soft errors within the datapath are both detected and recovered from. The proposed microarchitecture selectively re-executes corrupted instructions, reducing the cost of recovery by preserving completed instructions that were unaffected by the fault. The additional computational workload is dynamically staggered to exploit the out-of-order nature of the architecture and to minimize resource conflicts and delays.
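The detect-and-recover idea can be illustrated with a small software model. The sketch below is only an illustration under strong assumptions, not the proposed microarchitecture: instructions are independent three-field tuples, a soft error is modeled as a single bit flip in the first primary result, and the names `execute`, `run_with_redundancy`, and `faults` are hypothetical. It does not model out-of-order issue or the staggering of the redundant workload.

```python
import operator

# Hypothetical toy instruction set; not taken from the paper.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def execute(op, a, b, flip_bit=None):
    """Compute one result; optionally flip one bit to imitate a soft error."""
    result = OPS[op](a, b)
    if flip_bit is not None:
        result ^= 1 << flip_bit
    return result


def run_with_redundancy(program, faults):
    """Execute every instruction twice and re-execute only on a mismatch.

    program: list of (op, a, b) tuples, assumed independent of each other.
    faults:  {instruction index: bit position}; the fault is transient, so it
             corrupts only the first primary execution of that instruction.
    """
    committed, reexecuted = [], []
    for i, (op, a, b) in enumerate(program):
        primary = execute(op, a, b, faults.get(i))
        redundant = execute(op, a, b)
        if primary != redundant:
            # Detection: the two copies disagree. Recovery: redo this
            # instruction only; results already committed are kept.
            reexecuted.append(i)
            primary = execute(op, a, b)
        committed.append(primary)
    return committed, reexecuted


program = [("add", 2, 3), ("mul", 4, 5), ("sub", 9, 1)]
results, redone = run_with_redundancy(program, faults={1: 0})
print(results, redone)  # [5, 20, 8] [1]
```

In this model only instruction 1 is re-executed, and the results of instructions 0 and 2 are preserved, which mirrors the selective recovery described above.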