In the temporal difference (TD) model of primate dopamine neurons, phasic activity reports a prediction error for future reward. This model is supported by a wealth of experimental data. However, in certain circumstances, the activity of the dopamine cells seems anomalous under the model, as they respond in particular ways to stimuli that are not obviously related to predictions of reward. In this paper, we address two important sets of anomalies, those having to do with generalization and novelty. Generalization responses are treated as the natural consequence of partial information; novelty responses are treated by the suggestion that dopamine cells multiplex information about reward bonuses, including exploration bonuses and shaping bonuses. We interpret this additional role for dopamine in terms of its mechanistic attentional and psychomotor effects, giving it the computational role of guiding exploration.
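A minimal sketch, in Python, of the quantities the abstract refers to: the standard TD prediction error, together with hedged illustrations of the two bonus types it discusses. The novelty_bonus heuristic, the potential values, and all numerical choices below are hypothetical assumptions for illustration, not the paper's model; the shaping bonus follows the potential-based form of Ng et al. (1999), which leaves the optimal policy unchanged.

# Sketch of a TD prediction error with exploration and shaping bonuses
# folded into the effective reward. All parameter values are illustrative.

GAMMA = 0.9  # discount factor (hypothetical choice)

def td_error(r, v_s, v_next):
    """Standard TD prediction error: delta = r + gamma*V(s') - V(s)."""
    return r + GAMMA * v_next - v_s

def novelty_bonus(visit_count, scale=1.0):
    """Hypothetical exploration bonus that decays as a state grows familiar."""
    return scale / (1.0 + visit_count)

def shaping_bonus(phi_s, phi_next):
    """Potential-based shaping bonus, F = gamma*phi(s') - phi(s)."""
    return GAMMA * phi_next - phi_s

# Example: a novel stimulus that predicts no actual reward still yields a
# positive, dopamine-like TD signal once the bonuses enter the effective reward.
r, v_s, v_next = 0.0, 0.0, 0.0   # no real reward, no learned values yet
visit_count = 0                  # stimulus never encountered before
phi_s, phi_next = 0.0, 0.5       # hypothetical progress potentials
effective_r = r + novelty_bonus(visit_count) + shaping_bonus(phi_s, phi_next)
print(td_error(effective_r, v_s, v_next))  # prints 1.45, a positive response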