A Reinforcement Learning Algorithm for Spiking Neural Networks

Authors:
Razvan V. Florian
Affiliations:
Center for Cognitive and Neural Studies, University of Genoa and Babeş-Bolyai University
Venue:
SYNASC '05 Proceedings of the Seventh International Symposium on Symbolic and Numeric Algorithms for Scientific Computing
Year:
2005

Cited by (4):

Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity
Neural Computation

A Model of Neuronal Specialization Using Hebbian Policy-Gradient with "Slow" Noise
ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I

Phase precession and recession with STDP and Anti-STDP
ICANN '06 Proceedings of the 16th International Conference on Artificial Neural Networks - Volume Part I

Spiking neural controllers for pushing objects around
SAB '06 Proceedings of the 9th International Conference on From Animals to Animats: Simulation of Adaptive Behavior

Abstract

The paper presents a new reinforcement learning mechanism for spiking neural networks. The algorithm is derived for networks of stochastic integrate-and-fire neurons, but it can also be applied to generic spiking neural networks. Learning is achieved by synaptic changes that depend on the firing of pre- and postsynaptic neurons, and that are modulated with a global reinforcement signal. The efficacy of the algorithm is verified in a biologically-inspired experiment, featuring a simulated worm that searches for food. Our model recovers a form of neural plasticity experimentally observed in animals, combining spike-timing-dependent synaptic changes of one sign with nonassociative synaptic changes of the opposite sign determined by presynaptic spikes. The model also predicts that the time constant of spike-timing-dependent synaptic changes is equal to the membrane time constant of the neuron, in agreement with experimental observations in the brain. This study also led to the discovery of a biologically-plausible reinforcement learning mechanism that works by modulating spike-timing-dependent plasticity (STDP) with a global reward signal.
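
The idea of modulating STDP with a global reward signal can be illustrated with a minimal sketch for a single synapse. This is not the rule derived in the paper: it is a generic reward-modulated STDP update, and the function name, the eligibility trace z, and all parameter values (tau_z, a_plus, a_minus, lr, w0) are assumptions chosen for illustration. The only detail taken from the abstract is that the STDP time constant is set equal to the membrane time constant tau_m.

```python
import math


def simulate_synapse(pre_spikes, post_spikes, rewards, dt=1.0,
                     tau_m=20.0, tau_z=25.0, a_plus=1.0, a_minus=-1.0,
                     lr=0.01, w0=0.5):
    """Generic reward-modulated STDP for one synapse (illustrative only).

    pre_spikes, post_spikes: sequences of 0/1 per time step.
    rewards: global reinforcement signal per time step.
    tau_m: membrane time constant (ms), reused as the STDP time constant,
           following the prediction stated in the abstract.
    All other parameters are hypothetical values, not taken from the paper.
    """
    trace_decay = math.exp(-dt / tau_m)  # decay of the spike traces
    z_decay = math.exp(-dt / tau_z)      # decay of the eligibility trace
    pre_trace = 0.0   # low-pass filtered presynaptic spikes
    post_trace = 0.0  # low-pass filtered postsynaptic spikes
    z = 0.0           # eligibility trace accumulating STDP contributions
    w = w0
    for pre, post, r in zip(pre_spikes, post_spikes, rewards):
        pre_trace = pre_trace * trace_decay + pre
        post_trace = post_trace * trace_decay + post
        # Pre-before-post pairs contribute a_plus, post-before-pre a_minus.
        stdp = a_plus * pre_trace * post + a_minus * post_trace * pre
        z = z * z_decay + stdp
        # The global reward gates the sign and size of the weight change.
        w += lr * r * z
    return w


if __name__ == "__main__":
    pre = [1, 0, 0, 0, 0]
    post = [0, 0, 1, 0, 0]
    print(simulate_synapse(pre, post, rewards=[1.0] * 5))   # rewarded causal pair
    print(simulate_synapse(pre, post, rewards=[-1.0] * 5))  # punished causal pair
```

In this sketch, a presynaptic spike followed by a postsynaptic spike raises the weight when the reward is positive and lowers it when the reward is negative; with zero reward the weight does not change, whatever the spike timing.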