The desire to create novel computing systems, paired with recent advances in neuroscientific understanding of the brain, has led researchers to develop neuromorphic architectures that emulate the brain. To date, such models have been developed, trained, and deployed on the same substrate. However, excessive co-dependence between the substrate and the algorithm prevents portability, or at the very least requires reconstructing and retraining the model whenever the substrate changes. This paper proposes a well-defined abstraction layer, the Neuromorphic Instruction Set Architecture (NISA), that separates a neural application's algorithmic specification from the underlying execution substrate, and describes the Aivo framework, which demonstrates the concrete advantages of such an abstraction layer. Aivo consists of a NISA implementation for a rate-encoded neuromorphic system based on the cortical column abstraction, a state-of-the-art integrated development and runtime environment (IDE), and various profile-based optimization tools. Aivo's IDE generates code for emulating cortical networks on the host CPU, on multiple GPGPUs, or as Boolean functions. Its runtime system can deploy and adaptively optimize cortical networks in a manner similar to just-in-time compilers in managed runtime systems (e.g., Java, C#). We demonstrate the capabilities of the NISA abstraction by constructing a cortical network model of the mammalian visual cortex, deploying it on multiple execution substrates, and applying the optimization tools we have created. For this hierarchical configuration, Aivo's profile-based network optimization tools reduce the memory footprint by 50% and improve execution time by 3x on the host CPU. Deploying the same network on a single GPGPU results in a 30x speedup, and deploying a massively scaled cortical network across three GPGPUs achieves a 480x speedup. Finally, converting a trained hierarchical network to C/C++ Boolean constructs on the host CPU results in a 44x speedup.
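The central claim of the abstract is the separation of a network's specification from its execution substrate. As a rough illustration only, the sketch below renders that idea as a small C++ interface: a substrate-independent network description consumed by interchangeable backends. All names and interfaces here (NetworkSpec, Substrate, CpuSubstrate, GpuSubstrate, deploy, step) are hypothetical assumptions for illustration and are not Aivo's actual NISA or API.

```cpp
// Minimal sketch (hypothetical; names and interfaces are illustrative, not Aivo's API).
// The idea: the network is specified once against an abstract, NISA-style description,
// and separate backends (CPU emulation, GPGPU, Boolean conversion) consume that
// description, so changing the substrate never requires rebuilding the model.

#include <cstdio>
#include <memory>
#include <vector>

// Substrate-independent description of a hierarchical cortical network:
// a list of levels, each holding some number of cortical columns.
struct NetworkSpec {
    std::vector<int> columns_per_level;   // e.g. {64, 16, 4} for a 3-level hierarchy
};

// Abstract execution substrate. Concrete backends implement deployment and a
// single simulation step; the network specification itself never changes.
class Substrate {
public:
    virtual ~Substrate() = default;
    virtual void deploy(const NetworkSpec& spec) = 0;
    virtual void step() = 0;   // advance the rate-encoded network by one cycle
};

// Host-CPU emulation backend (stand-in for a CPU code generator).
class CpuSubstrate : public Substrate {
public:
    void deploy(const NetworkSpec& spec) override {
        total_columns_ = 0;
        for (int n : spec.columns_per_level) total_columns_ += n;
        std::printf("CPU backend: deployed %d columns\n", total_columns_);
    }
    void step() override {
        // ... evaluate every column's activation sequentially on the host CPU ...
    }
private:
    int total_columns_ = 0;
};

// Placeholder for a GPGPU backend; a real one would launch CUDA kernels.
class GpuSubstrate : public Substrate {
public:
    void deploy(const NetworkSpec& spec) override {
        std::printf("GPU backend: deployed %zu levels\n", spec.columns_per_level.size());
    }
    void step() override {
        // ... launch one kernel per level, evaluating its columns in parallel ...
    }
};

int main() {
    NetworkSpec visual_hierarchy{{64, 16, 4}};              // specified once
    std::unique_ptr<Substrate> backend = std::make_unique<CpuSubstrate>();
    backend->deploy(visual_hierarchy);
    backend->step();
    backend = std::make_unique<GpuSubstrate>();             // retarget by swapping
    backend->deploy(visual_hierarchy);                      // the backend only
    backend->step();
    return 0;
}
```

Under such a separation, retargeting the trained visual-cortex model from the host CPU to one or more GPGPUs, or converting it to Boolean constructs, amounts to swapping the backend, which is the portability property the abstract attributes to NISA.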