Through parallel processing, low-precision fixed-point hardware can be used to build a very high-speed neural network computing engine, where the low precision yields a drastic reduction in system cost. The reduced silicon area required to implement a single processing unit makes it possible to place multiple processing units on a single piece of silicon and operate them in parallel. The key question that arises is how much precision is required to implement neural network algorithms on such low-precision hardware. A theoretical analysis of the error due to finite-precision computation was undertaken to determine the precision necessary for successful forward retrieval and back-propagation learning in a multilayer perceptron. This analysis extends readily to a general finite-precision analysis technique by which most neural network algorithms may be evaluated under any set of hardware constraints.
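The precision question can be explored empirically before committing to hardware. The sketch below (an illustrative assumption, not the paper's analysis) simulates fixed-point arithmetic by quantizing weights, inputs, and activations of a small multilayer perceptron to a given number of fractional bits, then measures how the forward-retrieval output diverges from a floating-point reference as precision shrinks. The network shapes, the `quantize` helper, and the sigmoid activation are all hypothetical choices for illustration.

```python
import numpy as np

def quantize(x, frac_bits):
    """Round x to the nearest fixed-point value with `frac_bits` fractional bits
    (a simple model of low-precision hardware; saturation is ignored here)."""
    scale = 2.0 ** frac_bits
    return np.round(x * scale) / scale

def mlp_forward(x, weights, frac_bits=None):
    """Forward retrieval through an MLP with sigmoid units.
    If frac_bits is given, inputs, weights, and every intermediate result
    are quantized, mimicking a fixed-point processing unit."""
    a = x if frac_bits is None else quantize(x, frac_bits)
    for W in weights:
        Wq = W if frac_bits is None else quantize(W, frac_bits)
        z = a @ Wq                      # multiply-accumulate stage
        if frac_bits is not None:
            z = quantize(z, frac_bits)  # truncate the accumulator
        a = 1.0 / (1.0 + np.exp(-z))    # sigmoid activation
        if frac_bits is not None:
            a = quantize(a, frac_bits)  # quantized activation output
    return a

# Hypothetical two-layer network with random weights.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 4))]
x = rng.normal(size=(1, 8))

ref = mlp_forward(x, weights)  # full floating-point reference
for b in (4, 8, 12, 16):
    err = np.max(np.abs(mlp_forward(x, weights, frac_bits=b) - ref))
    print(f"{b:2d} fractional bits: max output error = {err:.2e}")
```

Sweeping the bit width this way gives a quick empirical counterpart to the theoretical error analysis: the output error shrinks as fractional bits are added, indicating where additional precision stops paying for its silicon cost.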