Neural Network Guided Spatial Fault Resilience in Array Processors

Authors:
Suraj Sindia; Vishwani D. Agrawal
Affiliations:
Department of Electrical and Computer Engineering, Auburn University, Alabama, USA 36849;Department of Electrical and Computer Engineering, Auburn University, Alabama, USA 36849
Venue:
Journal of Electronic Testing: Theory and Applications
Year:
2013

Abstract

Computing with large-die graphics processors, which contain huge arrays of identical structures, poses many challenges in the late CMOS era because of spatial non-idealities arising from chip-to-chip and within-chip variation of MOSFET threshold voltage. In this paper, we propose a software framework that uses machine learning for in-situ prediction and correction of computation corrupted by threshold voltage variation of transistors. A fully connected cascade feed-forward (FCCFF) neural network (NN) is trained in a semi-supervised manner. This FCCFF-NN then creates an accurate spatial map of faulty processing elements (PEs), which are avoided in computing. Besides correcting spatial faults, transient errors (such as single-event upsets) are also tracked and corrected if the number of affected PEs is large enough to cause noticeable computing errors. For experimental validation, we consider a 256 × 256 PE array. Each PE comprises an add-accumulate-multiply (AAM) block with three 8-bit registers (two for inputs and a third for storing the computed result). One thousand instances of this processor array are created, and the PEs in each instance are randomly perturbed with threshold voltage variation. Common image processing operations such as low-pass filtering and edge enhancement are performed on each of these 1,000 instances. A fraction of the resulting images (about 10 %) is used to train the NN for spatial non-idealities. Based on this training, the NN accurately predicts the spatial extremities in 95 % of the remaining 90 % of the cases. The proposed NN-based error tolerance produces superior-quality processed images whose degradation is no longer visually perceptible.
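
The idea of avoiding faulty PEs once a spatial fault map is available can be illustrated with a small sketch. This is not the authors' implementation: the fault map is simply handed to the function rather than predicted by an FCCFF-NN, the fault model (a faulty PE flips the most significant bit of its 8-bit result) is a hypothetical stand-in for threshold-voltage-induced errors, and the names `box_filter_pixel`, `run_array`, and `count_mismatches` are invented for illustration. A 16 × 16 array is used instead of 256 × 256 to keep the example fast.

```python
import random

def box_filter_pixel(img, r, c):
    """3x3 mean (low-pass) filter at (r, c) with clamped borders, 8-bit result."""
    h, w = len(img), len(img[0])
    acc = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr = min(max(r + dr, 0), h - 1)
            cc = min(max(c + dc, 0), w - 1)
            acc += img[rr][cc]
    return (acc // 9) & 0xFF

def run_array(img, faulty, fault_map=None):
    """Simulate one PE per pixel.

    faulty:    set of (row, col) PEs that corrupt their result
               (hypothetical fault model: the top result bit is flipped).
    fault_map: set of PEs believed faulty; work for these PEs is assumed
               to be reassigned to a healthy PE, so the result is clean.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            val = box_filter_pixel(img, r, c)
            avoided = fault_map is not None and (r, c) in fault_map
            if (r, c) in faulty and not avoided:
                val ^= 0x80  # corrupted result from a faulty PE
            out[r][c] = val
    return out

def count_mismatches(a, b):
    """Number of pixel positions where two images differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

N = 16
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(N)] for _ in range(N)]
positions = [(r, c) for r in range(N) for c in range(N)]
faulty = set(rng.sample(positions, 10))  # 10 randomly placed faulty PEs

golden = run_array(image, faulty=set())                    # fault-free reference
corrupted = run_array(image, faulty)                       # no fault map used
corrected = run_array(image, faulty, fault_map=faulty)     # perfect fault map

print("corrupted pixels without map:", count_mismatches(golden, corrupted))
print("corrupted pixels with map:   ", count_mismatches(golden, corrected))
```

In this sketch a perfect fault map removes every corrupted pixel; an incomplete map would leave errors at exactly the faulty PEs it misses, which is why the accuracy of the predicted map matters.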