Timed delivery of reward signals in an autonomous robot

Authors:
William H. Alexander;Olaf Sporns
Affiliations:
Department of Psychology, Indiana University, Bloomington, IN;Department of Psychology, Indiana University, Bloomington, IN
Venue:
ICSAB Proceedings of the seventh international conference on simulation of adaptive behavior on From animals to animats
Year:
2002

Abstract

In this paper, we implement a computational model of a neuromodulatory system in an autonomous robot. The model is based on a set of anatomical and physiological properties of the mammalian dopamine system, one of several diffuse ascending systems of the brain. The output of this system acts as a value signal, which modulates widely distributed synaptic changes in sensory and motor areas. During reward conditioning, the model learns to generate tonic and phasic responses, which are consistent with potential roles as reward predictions and prediction errors. Different sets of neural units generate precisely timed signals that exert positive effects (predictive) and negative effects (if a predicted reward is omitted) on neuroplasticity. We test the learning and behavior of the robot in different environmental contexts, and observe changes in the development of neural connections within the neuromodulatory system that depend on the robot's interaction with the environment. Simulation of a computational model responsive to rewarding stimuli leads to the emergence of conditioned appetitive behaviors. These studies represent a step towards investigating the behavior of autonomous robots controlled by biologically based neuromodulatory systems.
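The reward-prediction and prediction-error signals described in the abstract can be illustrated with a generic temporal-difference (TD) learner. The following is a minimal sketch, not the authors' neural model: it assumes a single cue followed by a reward after a fixed delay, a tapped-delay-line stimulus representation, and a linear value estimate. The function name `run_trial` and all parameter values (`cue_t`, `reward_t`, `alpha`, `gamma`) are hypothetical choices made for illustration.

```python
import numpy as np

def run_trial(w, cue_t, reward_t, n_steps, rewarded=True, alpha=0.3, gamma=1.0):
    """Run one conditioning trial and return the TD error at each time step.

    w is a float array of value weights, updated in place (an assumption of
    this sketch; the paper's model uses neural units and synaptic changes).
    """
    def features(t):
        # Tapped delay line: one active feature per time step since cue onset.
        x = np.zeros(n_steps)
        if cue_t <= t < n_steps:
            x[t - cue_t] = 1.0
        return x

    deltas = np.zeros(n_steps)
    for t in range(n_steps - 1):
        x_now, x_next = features(t), features(t + 1)
        # Reward is delivered on arrival at time step reward_t, unless omitted.
        r = 1.0 if (rewarded and t + 1 == reward_t) else 0.0
        # TD error: reward plus discounted next prediction minus current prediction.
        delta = r + gamma * (w @ x_next) - (w @ x_now)
        w += alpha * delta * x_now
        # Attribute the error to the time step at which it is observed.
        deltas[t + 1] = delta
    return deltas

# Hypothetical trial layout: cue at step 5, reward at step 15, 25 steps per trial.
CUE_T, REWARD_T, N_STEPS = 5, 15, 25
weights = np.zeros(N_STEPS)

first = run_trial(weights, CUE_T, REWARD_T, N_STEPS)
for _ in range(500):
    run_trial(weights, CUE_T, REWARD_T, N_STEPS)
trained = run_trial(weights, CUE_T, REWARD_T, N_STEPS)
omitted = run_trial(weights, CUE_T, REWARD_T, N_STEPS, rewarded=False)

print("first trial, error at reward:", first[REWARD_T])
print("after training, error at cue:", trained[CUE_T])
print("after training, error at reward:", trained[REWARD_T])
print("omitted reward, error at reward time:", omitted[REWARD_T])
```

In this sketch the error starts out as a positive response to the unpredicted reward, moves to the cue as training proceeds, and turns negative at the expected reward time when the reward is withheld. This is the qualitative pattern the abstract attributes to the positive and negative signals acting on neuroplasticity.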