The Filter Checker: An Active Verification Management Approach
DFT '06 Proceedings of the 21st IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems
Dynamic verification with a checker processor severely degrades performance unless the checker is as fast as the main processor core. Rather than widening the checker's bandwidth, we propose an active verification management (AVM) approach that uses a checker hierarchy. Before an instruction is verified at the checker processor, a filter checker sets a correctness non-criticality indicator (CNI) bit to indicate how likely the instruction's result is to be unimportant for reliability. AVM uses the CNI information to implement a congestion avoidance policy. We propose both reactive and proactive congestion avoidance policies to mitigate the performance degradation caused by congestion at the checker. We evaluate the proposed AVM analytically using a simplified queueing model. Our experimental results show that AVM has the potential to solve the verification congestion problem when perfect fault coverage is not needed. Without AVM, congestion at the checker degrades performance by about 57% relative to a non-fault-tolerant processor. With good marking by AVM, the performance of a reliable processor approaches 95% of that of a processor with no verification. Instructions could instead be skipped at random, but that approach reduces fault coverage. A filter checker whose marking policy is correlated with the correctness non-criticality metric, by contrast, significantly reduces the soft error rate. Finally, we present results showing the trade-off between performance and reliability.
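The reactive congestion avoidance idea described in the abstract can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions, not the paper's queueing model or simulator: the function name `simulate`, all parameters (issue widths, queue capacity, skip threshold, fraction of CNI-marked instructions), and the random assignment of CNI bits are hypothetical stand-ins. In particular, a real filter checker would compute the CNI bit from the instruction stream rather than draw it at random.

```python
import random
from collections import deque


def simulate(n_instr, main_ipc, checker_ipc, queue_cap, threshold,
             cni_fraction, seed=0):
    """Toy model of a main core feeding a slower checker through a queue.

    threshold=None disables AVM: every instruction must be verified.
    Otherwise, an instruction whose (hypothetical) CNI bit is set skips
    verification whenever the queue already holds >= threshold entries.
    """
    rng = random.Random(seed)
    # Assumption: CNI bits are drawn at random ahead of time. This stands
    # in for the filter checker's marking, which this sketch does not model.
    cni_bits = [rng.random() < cni_fraction for _ in range(n_instr)]

    queue = deque()
    issued = verified = skipped = cycles = 0
    while issued < n_instr or queue:
        cycles += 1
        # The checker verifies up to checker_ipc queued instructions.
        for _ in range(min(checker_ipc, len(queue))):
            queue.popleft()
            verified += 1
        # The main core retires up to main_ipc instructions per cycle.
        for _ in range(main_ipc):
            if issued >= n_instr:
                break
            cni = cni_bits[issued]
            if threshold is not None and cni and len(queue) >= threshold:
                skipped += 1      # reactive policy: bypass the checker
                issued += 1
                continue
            if len(queue) >= queue_cap:
                break             # queue full: the main core stalls
            queue.append(cni)
            issued += 1
    return {"cycles": cycles, "verified": verified, "skipped": skipped}


if __name__ == "__main__":
    base = simulate(10_000, 4, 2, 16, None, 0.6)   # no AVM
    avm = simulate(10_000, 4, 2, 16, 8, 0.6)       # reactive AVM
    print("no AVM:", base)
    print("AVM:   ", avm)
    print("coverage with AVM:", avm["verified"] / 10_000)
```

In this sketch the fraction of verified instructions is a crude proxy for fault coverage, so comparing `cycles` against `verified` across different thresholds gives a rough feel for the performance versus reliability trade-off the abstract mentions. A proactive policy would need some prediction of upcoming congestion and is not modeled here.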