The ability to adapt behavior to maximize reward through interactions with the environment is crucial for the survival of any higher organism. In the framework of reinforcement learning, temporal-difference learning algorithms provide an effective strategy for such goal-directed adaptation, but it is unclear to what extent these algorithms are compatible with neural computation. In this article, we present a spiking neural network model that implements actor-critic temporal-difference learning by combining local plasticity rules with a global reward signal. The network is capable of solving a nontrivial gridworld task with sparse rewards. We derive a quantitative mapping of plasticity parameters and synaptic weights to the corresponding variables in the standard algorithmic formulation and demonstrate that the network learns at a speed comparable to its discrete-time counterpart and attains the same equilibrium performance.
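For orientation, the standard algorithmic formulation that the spiking network is mapped onto can be sketched as tabular actor-critic TD(0) learning on a small gridworld with a single sparse reward. This is a minimal illustrative sketch, not the paper's network model: the grid size, learning rates (`ALPHA`, `BETA`), and discount factor `GAMMA` are assumed values, and the softmax actor is one common choice of policy parameterization.

```python
import math
import random

# Tabular actor-critic TD(0) on a 5x5 gridworld with one sparse reward.
# The TD error delta plays the role of the global reward signal; the
# critic (V) and actor (H) updates are local to the visited state/action.
SIZE = 5
GOAL = (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
GAMMA, ALPHA, BETA = 0.9, 0.1, 0.1

V = {}   # critic: state-value estimates
H = {}   # actor: action preferences

def policy(s):
    """Sample an action from a softmax over the actor's preferences."""
    prefs = [H.get((s, a), 0.0) for a in ACTIONS]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    r = random.random() * sum(exps)
    acc = 0.0
    for a, e in zip(ACTIONS, exps):
        acc += e
        if r <= acc:
            return a
    return ACTIONS[-1]

def step(s, a):
    """Deterministic move, clipped at the grid border; reward only at the goal."""
    s2 = (min(max(s[0] + a[0], 0), SIZE - 1),
          min(max(s[1] + a[1], 0), SIZE - 1))
    return s2, (1.0 if s2 == GOAL else 0.0)

def episode(max_steps=200):
    s, steps = (0, 0), 0
    while s != GOAL and steps < max_steps:
        a = policy(s)
        s2, r = step(s, a)
        # No bootstrapping from the terminal state.
        target = r if s2 == GOAL else r + GAMMA * V.get(s2, 0.0)
        delta = target - V.get(s, 0.0)                  # TD error
        V[s] = V.get(s, 0.0) + ALPHA * delta            # critic update
        H[(s, a)] = H.get((s, a), 0.0) + BETA * delta   # actor update
        s, steps = s2, steps + 1
    return steps

random.seed(0)
lengths = [episode() for _ in range(300)]
print("mean steps, first 20 episodes:", sum(lengths[:20]) / 20)
print("mean steps, last 20 episodes:", sum(lengths[-20:]) / 20)
```

Episode lengths shrink toward the shortest path as the TD error propagates value from the goal back to the start, mirroring the "similar speed to the discrete-time counterpart" comparison made in the abstract.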