In the temporal difference (TD) model of primate dopamine neurons, phasic activity reports a prediction error for future reward. This model is supported by a wealth of experimental data. However, in certain circumstances, the activity of the dopamine cells seems anomalous under the model, as they respond in particular ways to stimuli that are not obviously related to predictions of reward. In this paper, we address two important sets of anomalies, those having to do with generalization and novelty. Generalization responses are treated as the natural consequence of partial information; novelty responses are treated by the suggestion that dopamine cells multiplex information about reward bonuses, including exploration bonuses and shaping bonuses. We interpret this additional role for dopamine in terms of its mechanistic attentional and psychomotor effects, giving it the computational role of guiding exploration.
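A minimal sketch, in Python, of the quantities the abstract refers to: the standard TD prediction error, together with hedged illustrations of the two bonus types it discusses. The novelty_bonus heuristic, the potential values, and all numerical choices below are hypothetical assumptions for illustration, not the paper's model; the shaping bonus follows the potential-based form of Ng et al. (1999), which leaves the optimal policy unchanged.

# Sketch of a TD prediction error with exploration and shaping bonuses
# folded into the effective reward. All parameter values are illustrative.

GAMMA = 0.9  # discount factor (hypothetical choice)

def td_error(r, v_s, v_next):
    """Standard TD prediction error: delta = r + gamma*V(s') - V(s)."""
    return r + GAMMA * v_next - v_s

def novelty_bonus(visit_count, scale=1.0):
    """Hypothetical exploration bonus that decays as a state grows familiar."""
    return scale / (1.0 + visit_count)

def shaping_bonus(phi_s, phi_next):
    """Potential-based shaping bonus, F = gamma*phi(s') - phi(s)."""
    return GAMMA * phi_next - phi_s

# Example: a novel stimulus that predicts no actual reward still yields a
# positive, dopamine-like TD signal once the bonuses enter the effective reward.
r, v_s, v_next = 0.0, 0.0, 0.0   # no real reward, no learned values yet
visit_count = 0                  # stimulus never encountered before
phi_s, phi_next = 0.0, 0.5       # hypothetical progress potentials
effective_r = r + novelty_bonus(visit_count) + shaping_bonus(phi_s, phi_next)
print(td_error(effective_r, v_s, v_next))  # prints 1.45, a positive response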