Transient fault detection via simultaneous multithreading
Proceedings of the 27th annual international symposium on Computer architecture
Detailed design and evaluation of redundant multithreading alternatives
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
ReVive: cost-effective architectural support for rollback recovery in shared-memory multiprocessors
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
IBM's S/390 G5 Microprocessor Design
IEEE Micro
Power4 System Design for High Reliability
IEEE Micro
Impact of Deep Submicron Technology on Dependability of VLSI Circuits
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic
DSN '02 Proceedings of the 2002 International Conference on Dependable Systems and Networks
Evaluation of a 32-bit Microprocessor with Built-In Concurrent Error-Detection
FTCS '97 Proceedings of the 27th International Symposium on Fault-Tolerant Computing (FTCS '97)
The architecture of Tandem's NonStop system
ACM '81 Proceedings of the ACM '81 conference
Y-Branches: When You Come to a Fork in the Road, Take It
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Cache Scrubbing in Microprocessors: Myth or Necessity?
PRDC '04 Proceedings of the 10th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC'04)
Characterization of Soft Errors Caused by Single Event Upsets in CMOS Processes
IEEE Transactions on Dependable and Secure Computing
SWIFT: Software Implemented Fault Tolerance
Proceedings of the international symposium on Code generation and optimization
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Design and Evaluation of Hybrid Fault-Detection Systems
Proceedings of the 32nd annual international symposium on Computer Architecture
ReStore: Symptom Based Soft Error Detection in Microprocessors
DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Software-controlled fault tolerance
ACM Transactions on Architecture and Code Optimization (TACO)
Automatic Instruction-Level Software-Only Recovery
DSN '06 Proceedings of the International Conference on Dependable Systems and Networks
In-Register Duplication: Exploiting Narrow-Width Value for Improving Register File Reliability
DSN '06 Proceedings of the International Conference on Dependable Systems and Networks
The LLVM compiler framework and infrastructure tutorial
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Hi-index | 0.00 |
Dramatic increases in the number of transistors that can be integrated on a chip make processors more susceptible to radiation-induced transient errors. For commodity chips which are cost- and energy-constrained, we need a flexible and inexpensive technology for fault detection. Software approaches can play a major role for this sector of the market because they need little hardware modifications and can be tailored to fit different requirements of reliability and performance. However, software approaches add a significant overhead.In this paper we propose two novel techniques that reduce the overhead of software error checking approaches. The first technique uses boolean logic to identify code patterns that correspond to outcome tolerant branches. We develop a compiler algorithm that finds those patterns and removes the unnecessary replicas. In the second technique we evaluate the performance benefit obtained by removing address checks before load and stores. In addition, we evaluate the overheads that can be removed when the register file is protected in hardware.Our experimental results show that the first technique improves performance by an average 7% for three of the SPEC benchmarks. The second technique can reduce overhead by up-to 50% when the most aggressive optimization is applied.