ISCA '96 Proceedings of the 23rd annual international symposium on Computer architecture
Proceedings of the 24th annual international symposium on Computer architecture
DIVA: a reliable substrate for deep submicron microarchitecture design
Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture
Transient fault detection via simultaneous multithreading
Proceedings of the 27th annual international symposium on Computer architecture
Slipstream processors: improving both performance and fault tolerance
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Transient-fault recovery using simultaneous multithreading
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Detailed design and evaluation of redundant multithreading alternatives
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Dual use of superscalar datapath for transient-fault detection and recovery
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Handling long-latency loads in a simultaneous multithreading processor
Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
IBM's S/390 G5 Microprocessor Design
IEEE Micro
Concurrent Error Detection Using Watchdog Processors-A Survey
IEEE Transactions on Computers
Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors
FTCS '99 Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing
Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Transient-fault recovery for chip multiprocessors
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Techniques to Reduce the Soft Error Rate of a High-Performance Microprocessor
Proceedings of the 31st annual international symposium on Computer architecture
Proceedings of the 31st annual international symposium on Computer architecture
Opportunistic Transient-Fault Detection
IEEE Micro
Self-checking instructions: reducing instruction redundancy for concurrent error detection
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
SlicK: slice-based locality exploitation for efficient redundant multithreading
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Examining ACE analysis reliability estimates using fault-injection
Proceedings of the 34th annual international symposium on Computer architecture
Mechanisms for bounding vulnerabilities of processor structures
Proceedings of the 34th annual international symposium on Computer architecture
Dynamic prediction of architectural vulnerability from microarchitectural state
Proceedings of the 34th annual international symposium on Computer architecture
Compiler-Managed Software-based Redundant Multi-Threading for Transient Fault Detection
Proceedings of the International Symposium on Code Generation and Optimization
Modeling and improving data cache reliability: 1
Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Transient fault prediction based on anomalies in processor events
Proceedings of the conference on Design, automation and test in Europe
Hierarchical Verification for Increasing Performance in Reliable Processors
Journal of Electronic Testing: Theory and Applications
Efficient fault tolerance in multi-media applications through selective instruction replication
Proceedings of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies
Exploiting selective placement for low-cost memory protection
ACM Transactions on Architecture and Code Optimization (TACO)
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Compiler-assisted soft error detection under performance and energy constraints in embedded systems
ACM Transactions on Embedded Computing Systems (TECS)
End-to-end register data-flow continuous self-test
Proceedings of the 36th annual international symposium on Computer architecture
Architecture Design for Soft Errors
Architecture Design for Soft Errors
Selective replication: A lightweight technique for soft errors
ACM Transactions on Computer Systems (TOCS)
Shoestring: probabilistic soft error reliability on the cheap
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Cache vulnerability equations for protecting data in embedded processor caches from soft errors
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
Using hardware vulnerability factors to enhance AVF analysis
Proceedings of the 37th annual international symposium on Computer architecture
A cost effective approach for online error detection using invariant relationships
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Modeling soft errors for data caches and alleviating their effects on data reliability
Microprocessors & Microsystems
Detecting errors using multi-cycle invariance information
Proceedings of the Conference on Design, Automation and Test in Europe
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
On the exploitation of narrow-width values for improving register file reliability
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A fault-tolerant, dynamically scheduled pipeline structure for chip multiprocessors
SAFECOMP'11 Proceedings of the 30th international conference on Computer safety, reliability, and security
Trade-offs in transient fault recovery schemes for redundant multithreaded processors
HiPC'06 Proceedings of the 13th international conference on High Performance Computing
Reliability, thermal, and power modeling and optimization
Proceedings of the International Conference on Computer-Aided Design
Hardware dependability in the presence of soft errors
VoCS'08 Proceedings of the 2008 international conference on Visions of Computer Science: BCS International Academic Conference
A first-order mechanistic model for architectural vulnerability factor
Proceedings of the 39th Annual International Symposium on Computer Architecture
Thread vulnerability in parallel applications
Journal of Parallel and Distributed Computing
RISE: improving the streaming processors reliability against soft errors in gpgpus
Proceedings of the 21st international conference on Parallel architectures and compilation techniques
Dynamic code duplication with vulnerability awareness for soft error detection on VLIW architectures
ACM Transactions on Architecture and Code Optimization (TACO) - Special Issue on High-Performance Embedded Architectures and Compilers
Proceedings of the 27th international ACM conference on International conference on supercomputing
Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Low cost control flow protection using abstract control signatures
Proceedings of the 14th ACM SIGPLAN/SIGBED conference on Languages, compilers and tools for embedded systems
Cost-effective soft-error protection for SRAM-based structures in GPGPUs
Proceedings of the ACM International Conference on Computing Frontiers
A survey of checker architectures
ACM Computing Surveys (CSUR)
Proceedings of the 2013 International Conference on Compilers, Architectures and Synthesis for Embedded Systems
Hi-index | 0.00 |
CMOS scaling increases susceptibility of microprocessors to transient faults. Most current proposals for transient-fault detection use full redundancy to achieve perfect coverage while incurring significant performance degradation. However, most commodity systems do not need or provide perfect coverage. A recent paper explores this leniency to reduce the soft-error rate of the issue queue during L2 misses while incurring minimal performance degradation. Whereas the previous paper reduces soft-error rate without using any redundancy, we target better coverage while incurring similarly-minimal performance degradation by opportunistically using redundancy. We propose two semi-complementary techniques, called partial explicit redundancy (PER) and implicit redundancy through reuse (IRTR), to explore the trade-off between soft-error rate and performance. PER opportunistically exploits low-ILP phases and L2 misses to introduce explicit redundancy with minimal performance degradation. Because PER covers the entire pipeline and exploits not only L2 misses but all low-ILP phases, PER achieves better coverage than the previous work. To achieve coverage in high-ILP phases as well, we propose implicit redundancy through reuse (IRTR). Previous work exploits the phenomenon of instruction reuse to avoid redundant execution while falling back on redundant execution when there is no reuse. IRTR takes reuse to the extreme of performance-coverage trade-off and completely avoids explicit redundancy by exploiting reuseýs implicit redundancy within the main thread for fault detection with virtually no performance degradation. Using simulations with SPEC2000, we show that PER and IRTR achieve better trade-off between soft-error rate and performance degradation than the previous schemes.