To improve performance and reduce power, processor designers employ advances that shrink feature sizes, lower voltage levels, reduce noise margins, and increase clock rates. However, these advances make processors more susceptible to transient faults that can affect correctness. While reliable systems typically employ hardware techniques to address soft errors, software techniques can provide a lower-cost and more flexible alternative. This paper presents a novel, software-only, transient-fault-detection technique called SWIFT. SWIFT efficiently manages redundancy by reclaiming unused instruction-level resources present during the execution of most programs. SWIFT also provides a high level of protection and performance with an enhanced control-flow checking mechanism. We evaluate an implementation of SWIFT on an Itanium 2; the evaluation demonstrates exceptional fault coverage at a reasonable performance cost. Compared to the best-known single-threaded approach utilizing an ECC memory system, SWIFT demonstrates a 51% average speedup.
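The general idea behind software-only redundancy can be illustrated with a small sketch. It is not the SWIFT transformation itself, which the abstract describes only at a high level and which works on compiled instructions rather than source code. The sketch assumes the simplest possible scheme: compute a value twice and compare the two copies before the result is released. The names `checked_sum`, `FaultDetected`, and the `inject_fault_at` test hook are hypothetical and exist only for this illustration.

```python
class FaultDetected(Exception):
    """Raised when the two redundant copies of a value disagree."""


def checked_sum(values, inject_fault_at=None):
    """Sum `values` twice in lockstep and compare before returning.

    `inject_fault_at` is a hypothetical test hook: when set, the primary
    copy has its lowest bit flipped at that index to imitate a transient
    fault. It is not part of any real fault-tolerance API.
    """
    primary = 0
    shadow = 0
    for i, v in enumerate(values):
        primary += v
        shadow += v  # redundant copy of the same computation
        if i == inject_fault_at:
            primary ^= 1  # simulate a single-bit flip in one copy only
    # Compare the copies at the point where the result leaves the function.
    if primary != shadow:
        raise FaultDetected(f"copies disagree: {primary} != {shadow}")
    return primary


if __name__ == "__main__":
    print(checked_sum([1, 2, 3]))
    try:
        checked_sum([1, 2, 3], inject_fault_at=1)
    except FaultDetected as err:
        print("detected:", err)
```

In this sketch the comparison happens once, at the function boundary. Where such checks are placed, and how control flow is protected, are the design questions a real technique has to answer, and the sketch does not attempt them.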