The emergence of saliency and novelty responses from Reinforcement Learning principles

  • Author: Patryk A. Laurent
  • Affiliation: University of Pittsburgh, Centers for Neuroscience and for the Neural Basis of Cognition, 623 LRDC, 3939 O'Hara St., Pittsburgh, PA 15260, USA
  • Venue: Neural Networks
  • Year: 2008

Abstract

Recent attempts to map reward-based learning models, like Reinforcement Learning [Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An introduction. Cambridge, MA: MIT Press], to the brain are based on the observation that phasic increases and decreases in the spiking of dopamine-releasing neurons signal differences between predicted and received reward [Gillies, A., & Arbuthnott, G. (2000). Computational models of the basal ganglia. Movement Disorders, 15(5), 762-770; Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80(1), 1-27]. However, this reward-prediction error is only one of several signals communicated by that phasic activity; another involves an increase in dopaminergic spiking, reflecting the appearance of salient but unpredicted non-reward stimuli [Doya, K. (2002). Metalearning and neuromodulation. Neural Networks, 15(4-6), 495-506; Horvitz, J. C. (2000). Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience, 96(4), 651-656; Redgrave, P., & Gurney, K. (2006). The short-latency dopamine signal: A role in discovering novel actions? Nature Reviews Neuroscience, 7(12), 967-975], especially when an organism subsequently orients towards the stimulus [Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80(1), 1-27]. To explain these findings, Kakade and Dayan [Kakade, S., & Dayan, P. (2002). Dopamine: Generalization and bonuses. Neural Networks, 15(4-6), 549-559.] and others have posited that novel, unexpected stimuli are intrinsically rewarding. The simulation reported in this article demonstrates that this assumption is not necessary because the effect it is intended to capture emerges from the reward-prediction learning mechanisms of Reinforcement Learning. Thus, Reinforcement Learning principles can be used to understand not just reward-related activity of the dopaminergic neurons of the basal ganglia, but also some of their apparently non-reward-related activity.
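
For reference, the reward-prediction error invoked above is the standard temporal-difference (TD) error of Sutton and Barto (1998), which the abstract cites; the sketch below restates that textbook formulation and is not a description of the article's own simulation.

    % Temporal-difference (TD) prediction error (Sutton & Barto, 1998), the
    % quantity identified with phasic dopaminergic activity in the abstract.
    % V(s_t) is the learned value of the state at time t, r_{t+1} the reward
    % received on the transition, and \gamma \in [0, 1] a discount factor.
    \[
      \delta_t = r_{t+1} + \gamma\, V(s_{t+1}) - V(s_t)
    \]
    % Value estimates are moved toward the observed outcome with step size \alpha:
    \[
      V(s_t) \leftarrow V(s_t) + \alpha\, \delta_t
    \]

Under this formulation, an unpredicted stimulus whose learned value has become positive yields a positive TD error even when no reward is delivered on that step; this is the kind of non-reward phasic response that the article argues can emerge from reward-prediction learning alone, without positing an intrinsic novelty bonus.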