Introduction to Reinforcement Learning
Neuroscientists have identified a neural substrate of prediction and reward in experiments with primates. The so-called dopamine neurons have been shown to code an error in the temporal prediction of rewards. Similarly, artificial systems can "learn to predict" using temporal-difference (TD) methods. Based on the resemblance between the effective reinforcement term of TD models and the responses of dopamine neurons, neuroscientists have developed a TD-learning actor-critic neural model and compared its performance with the behavior of monkeys in the laboratory. We have used such a neural network model to learn to predict variable-delay rewards in a robot spatial choice task similar to the one neuroscientists use with primates. Such an architecture implementing TD learning appears to be a promising mechanism for robotic systems that learn from simple human teaching signals in the real world.
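The actor-critic TD scheme the abstract refers to can be illustrated with a minimal tabular sketch (not the paper's actual model; all names, parameters, and the one-state choice task below are illustrative assumptions). The TD prediction error plays the role the abstract ascribes to dopamine neurons: a single signal that trains both the critic (reward prediction) and the actor (action preferences).

```python
import math
import random

def softmax(prefs):
    # turn action preferences into a probability distribution
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train_actor_critic(reward_probs, episodes=2000, alpha=0.1, beta=0.1, seed=0):
    """Tabular actor-critic on a hypothetical one-state choice task.

    reward_probs[a] is the probability that action a yields reward 1.0.
    The TD error (reward minus the critic's prediction) updates both
    the critic's value estimate and the actor's preferences.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_probs)  # actor: one preference per action
    value = 0.0                        # critic: predicted reward in this state
    for _ in range(episodes):
        probs = softmax(prefs)
        # sample an action from the softmax policy
        a = rng.choices(range(len(prefs)), weights=probs)[0]
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        delta = r - value              # TD (prediction) error, dopamine-like signal
        value += alpha * delta         # critic update: improve reward prediction
        prefs[a] += beta * delta       # actor update: reinforce better-than-expected actions
    return prefs, value

prefs, value = train_actor_critic([0.8, 0.2])
```

After training, the preference for the richer action dominates and the critic's value settles near the reward rate of the learned policy; the same error signal drives both updates, which is the property the dopamine-recording experiments highlighted.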