In this paper, we focus on reliability, one of the most fundamental challenges in the nanoelectronics environment. A processor architecture built from unreliable nanoelectronic devices requires fault tolerance schemes to ensure the basic correctness of any computation. Since any fault tolerance approach demands redundancy in the form of either time or hardware, reliability must be considered in conjunction with performance and hardware tradeoffs. We propose a new computational model for nanoelectronics-based processor architectures that provides flexible fault tolerance to deal with high and time-varying fault rates. The model guarantees the correctness of instruction execution while dynamically balancing hardware and performance overheads. The correctness of every instruction is confirmed by multiple execution instances through a hybrid hardware-time redundancy approach. To achieve high system performance, multiple unconfirmed computation branches are exploited speculatively. The hardware resource growth that these speculative computations entail is controlled so that hardware utilization is balanced between the two competing goals of performance and fault tolerance. In addition, we examine the impact on the proposed computational model of other nanoelectronic characteristics, such as the need for localized interconnections and the regularity of nanofabric structures. We set up an experimental framework to validate the effectiveness of the proposed scheme and to investigate multiple tradeoff points within the approach. Simulation data confirm that the proposed computational model provides flexible fault tolerance under a wide range of fault occurrence rates, while at the same time guaranteeing high system performance and efficient utilization of hardware resources.
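The idea of confirming an instruction through multiple execution instances, spread across both hardware and time, can be illustrated with a small simulation. The sketch below is not the model proposed in the paper; it is a minimal illustration under assumed rules. The names `unreliable_add` and `confirm`, the single-bit-flip fault model, and the parameters `width` (instances run side by side per round, i.e. hardware redundancy) and `threshold` (agreeing results required for confirmation) are all hypothetical.

```python
import random
from collections import Counter


def unreliable_add(a, b, fault_rate, rng):
    """Hypothetical faulty unit: returns a + b, or a corrupted sum.

    Assumed fault model: with probability fault_rate, one of the
    eight low-order bits of the result is flipped.
    """
    result = a + b
    if rng.random() < fault_rate:
        result ^= 1 << rng.randrange(8)
    return result


def confirm(a, b, fault_rate, width, threshold, rng, max_rounds=1000):
    """Repeat an addition until some result has been seen `threshold` times.

    Each round runs `width` instances (hardware redundancy); rounds
    repeat until a result is confirmed (time redundancy).
    Returns (confirmed_value, rounds_used).
    """
    votes = Counter()
    for round_no in range(1, max_rounds + 1):
        for _ in range(width):
            votes[unreliable_add(a, b, fault_rate, rng)] += 1
        value, count = votes.most_common(1)[0]
        if count >= threshold:
            return value, round_no
    raise RuntimeError("result not confirmed within max_rounds")


if __name__ == "__main__":
    rng = random.Random(0)
    # Same confirmation threshold, different hardware/time split.
    for width in (1, 2, 3):
        value, rounds = confirm(3, 4, 0.2, width, 3, rng)
        print(f"width={width}: value={value}, rounds={rounds}")
```

In this sketch, a fault-free run with `threshold=3` needs three rounds when `width=1` and a single round when `width=3`, which is the hardware-versus-time tradeoff in its simplest form. A simple agreement count like this one can still confirm a wrong value when faults happen to coincide, so it does not by itself provide the correctness guarantee the abstract describes.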