Introduction to Reinforcement Learning
TD Models of reward predictive responses in dopamine neurons
Neural Networks - Computational models of neuromodulation
This paper presents a biologically constrained reward prediction model that learns cue-outcome associations involving temporally distant stimuli without relying on the commonly used temporal difference model. The model makes novel use of an adapted echo state network in place of the biologically implausible delay chains usually employed, in accounts of dopamine phenomena, to handle temporally structured stimuli. Moreover, the model is based on a novel algorithm that coordinates two subsystems: one providing Pavlovian conditioning, the other providing timely inhibition of dopamine responses to salient anticipated stimuli. The model is validated against the typical profile of phasic dopamine responses in first- and second-order Pavlovian conditioning. It is relevant not only to explaining the mechanisms underlying the biological regulation of dopamine signals, but also to applications in autonomous robotics involving reinforcement-based learning.
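To make the echo state network idea concrete, the following is a minimal sketch of a generic reservoir, not the adapted network of the paper, whose details the abstract does not give. It illustrates how a brief cue can leave a fading trace in a fixed random recurrent network, which is the property that lets such a network stand in for a delay chain. All names and parameter values (`make_reservoir`, `step`, `run`, `density`, `scale`, the reservoir size) are assumptions chosen for illustration. The recurrent weights are rescaled by their largest absolute row sum, which is a simple sufficient condition for fading memory and is used here instead of the more common spectral-radius heuristic.

```python
import math
import random


def make_reservoir(n_units, density=0.2, scale=0.9, seed=0):
    """Build fixed random weights for a toy reservoir (hypothetical parameters)."""
    rng = random.Random(seed)
    # Sparse recurrent matrix: each entry is nonzero with probability `density`.
    w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
          for _ in range(n_units)]
         for _ in range(n_units)]
    # Rescale so the largest absolute row sum equals `scale` (< 1).
    # With tanh units this guarantees that, once input stops, activity shrinks.
    max_row = max(sum(abs(x) for x in row) for row in w) or 1.0
    w = [[x * scale / max_row for x in row] for row in w]
    # Input weights: one random weight per unit for a scalar input.
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    return w, w_in


def step(state, u, w, w_in):
    """One reservoir update: x(t+1) = tanh(w_in * u(t) + W x(t))."""
    n = len(state)
    return [math.tanh(w_in[i] * u + sum(w[i][j] * state[j] for j in range(n)))
            for i in range(n)]


def run(inputs, n_units=30, seed=0):
    """Drive the reservoir with a scalar input sequence; return all states."""
    w, w_in = make_reservoir(n_units, seed=seed)
    state = [0.0] * n_units
    history = []
    for u in inputs:
        state = step(state, u, w, w_in)
        history.append(state)
    return history


if __name__ == "__main__":
    # A single cue at t = 0 followed by silence.
    states = run([1.0] + [0.0] * 9)
    for t, s in enumerate(states):
        print(t, max(abs(x) for x in s))
```

In a full echo state network only a linear readout from these states would be trained, while the reservoir weights stay fixed. How the paper's model reads the reservoir and couples it to the conditioning and inhibition subsystems is not specified in the abstract, so it is left out of this sketch.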