Multiple paired forward and inverse models for motor control. Neural Networks (special issue on neural control and robotics: biology and technology).
Multiple model-based reinforcement learning. Neural Computation.
Learning to Predict by the Methods of Temporal Differences. Machine Learning.
Long-term reward prediction in TD models of the dopamine system. Neural Computation.
Reinforcement learning models of the dopamine system and their behavioral implications.
Representation and timing in theories of the dopamine system. Neural Computation.
Noisy-or nodes for conditioning models. SAB'10: Proceedings of the 11th International Conference on Simulation of Adaptive Behavior: From Animals to Animats.
A number of computational models have explained the behavior of dopamine neurons in terms of temporal difference (TD) learning. However, earlier models cannot account for recent conditioning results; in particular, they fail to reproduce the behavior of dopamine neurons when the interval between a cue stimulus and a reward varies. We address this problem with a modular architecture in which each module consists of a reward predictor and a value estimator. A "responsibility signal", computed from the accuracy of each reward predictor, weights both the contributions and the learning of the corresponding value estimators. This multiple-model architecture accurately accounts for the behavior of dopamine neurons in two specific experiments: when the reward is delivered earlier than expected, and when the stimulus-reward interval varies uniformly over a fixed range.
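The architecture described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the module count, noise scale, learning rate, and tabular state representation are all assumptions made for the example, and the responsibility signal is computed here as a softmax over each module's Gaussian reward-prediction likelihood, which then gates both the reward-predictor and value-estimator updates.

```python
import numpy as np

# Hypothetical parameters (not from the paper; illustrative only)
n_modules = 2    # number of (reward predictor, value estimator) pairs
n_states = 10    # discretized time steps between cue and reward
gamma = 0.9      # discount factor
alpha = 0.1      # learning rate
sigma = 0.5      # assumed noise scale for responsibility computation

rng = np.random.default_rng(0)
reward_pred = rng.normal(0.0, 0.1, size=(n_modules, n_states))  # reward predictors
values = np.zeros((n_modules, n_states))                        # value estimators

def responsibilities(state, reward):
    """Softmax of each module's reward-prediction accuracy.

    Modules whose reward predictor matches the observed reward
    receive responsibility close to 1 and dominate learning.
    """
    errors = reward - reward_pred[:, state]
    log_lik = -0.5 * (errors / sigma) ** 2
    w = np.exp(log_lik - log_lik.max())
    return w / w.sum()

def step(state, next_state, reward):
    """One responsibility-weighted TD update across all modules."""
    lam = responsibilities(state, reward)
    # move each reward predictor toward the observed reward,
    # scaled by its responsibility
    reward_pred[:, state] += alpha * lam * (reward - reward_pred[:, state])
    # standard TD(0) error per module, again gated by responsibility
    td_error = reward + gamma * values[:, next_state] - values[:, state]
    values[:, state] += alpha * lam * td_error
    return lam

# Usage: train on a simple fixed-interval cue -> reward sequence
for episode in range(200):
    for s in range(n_states - 1):
        r = 1.0 if s == n_states - 2 else 0.0
        step(s, s + 1, r)
```

After training, the value estimates rise toward the reward time, and the responsibility signal determines which module's value estimator is trusted at each step; with a variable stimulus-reward interval, different modules would specialize on different intervals.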