Timed delivery of reward signals in an autonomous robot

  • Authors:
  • William H. Alexander; Olaf Sporns

  • Affiliations:
  • Department of Psychology, Indiana University, Bloomington, IN; Department of Psychology, Indiana University, Bloomington, IN

  • Venue:
  • ICSAB: Proceedings of the Seventh International Conference on Simulation of Adaptive Behavior: From Animals to Animats
  • Year:
  • 2002

Abstract

In this paper, we implement a computational model of a neuromodulatory system in an autonomous robot. The model is based on a set of anatomical and physiological properties of the mammalian dopamine system, one of several diffuse ascending systems of the brain. The output of this system acts as a value signal that modulates widely distributed synaptic changes in sensory and motor areas. During reward conditioning, the model learns to generate tonic and phasic responses that are consistent with potential roles as reward predictions and prediction errors. Different sets of neural units generate precisely timed signals that exert positive effects on neuroplasticity (when a reward is predicted) and negative effects (when a predicted reward is omitted). We test the learning and behavior of the robot in different environmental contexts, and observe changes in the development of neural connections within the neuromodulatory system that depend on the robot's interaction with the environment. Simulation of a computational model responsive to rewarding stimuli leads to the emergence of conditioned appetitive behaviors. These studies represent a step towards investigating the behavior of autonomous robots controlled by biologically based neuromodulatory systems.
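
The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea of a timed reward-prediction error. It uses textbook temporal-difference (TD) learning with one value weight per time step after a cue, which is an assumption made here and not the authors' neural-unit implementation. All names and parameter values (`run_trial`, `CUE_STEP`, `REWARD_STEP`, `alpha`, `gamma`) are hypothetical.

```python
# Illustrative TD(0) sketch, NOT the model from the paper.
# Assumption: time since cue onset is tracked by one weight per step.

CUE_STEP = 2       # hypothetical step at which the cue appears
REWARD_STEP = 7    # hypothetical step at which the reward arrives
N_STEPS = 10       # hypothetical trial length


def run_trial(weights, reward=1.0, alpha=0.2, gamma=1.0, learn=True):
    """Run one trial and return the prediction error at every step."""

    def value(t):
        # Predicted future reward at step t; zero before the cue
        # and from the expected reward time onward.
        k = t - CUE_STEP
        return weights[k] if 0 <= k < len(weights) else 0.0

    deltas = [0.0] * N_STEPS
    for t in range(1, N_STEPS):
        r = reward if t == REWARD_STEP else 0.0
        # Prediction error: what arrived plus what is now expected,
        # minus what was expected one step earlier.
        delta = r + gamma * value(t) - value(t - 1)
        deltas[t] = delta
        k = (t - 1) - CUE_STEP
        if learn and 0 <= k < len(weights):
            weights[k] += alpha * delta
    return deltas


# One weight for each step from cue onset up to (not including) the reward.
weights = [0.0] * (REWARD_STEP - CUE_STEP)

first = run_trial(weights)                  # naive: error at reward time
for _ in range(500):
    run_trial(weights)                      # conditioning trials
trained = run_trial(weights, learn=False)   # error has moved to cue onset
omitted = run_trial(weights, reward=0.0, learn=False)  # reward withheld

print("first trial:  ", [round(d, 3) for d in first])
print("after training:", [round(d, 3) for d in trained])
print("reward omitted:", [round(d, 3) for d in omitted])
```

In this sketch the error initially occurs when the reward is delivered, shifts to cue onset after conditioning, and turns negative at the expected reward time when the reward is withheld. This mirrors the positive and negative timed signals described in the abstract only qualitatively.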