Anticipating Rewards in Continuous Time and Space: A Case Study in Developmental Robotics

Authors:
Arnaud J. Blanchard; Lola Cañamero
Affiliations:
Adaptive System Research Group, School of Computer Science, University of Hertfordshire, College Lane, Hatfield, Herts AL10 9AB, UK (both authors)
Venue:
Anticipatory Behavior in Adaptive Learning Systems
Year:
2007

Abstract

This paper presents the first basic principles, implementation, and experimental results of what could be regarded as a new approach to reinforcement learning, in which agents (physical robots interacting with objects and other agents in the real world) learn to anticipate rewards from their sensory inputs. Our approach needs no discretization, notion of events, or classification. Instead of learning rewards for every possible action of an agent in every situation, we propose that agents learn only the main situations worth avoiding and reaching. The main focus of our work, however, is not reinforcement learning as such, but modeling cognitive development on a small autonomous robot interacting with an "adult" caretaker, typically a human, in the real world. The control architecture follows a Perception-Action approach incorporating a basic homeostatic principle. The interaction occurs in very close proximity, relies on very coarse and limited sensory-motor capabilities, and affects the "well-being" and affective state of the robot. The type of anticipatory behavior we are concerned with in this context relates to both sensory and reward anticipation. We have applied and tested our model on a real robot.
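
As an illustration only, the idea of anticipating reward directly from continuous sensory inputs, without discretizing them into states, can be sketched with a simple linear predictor trained by a delta rule. This is not the architecture described in the paper: the class `RewardPredictor`, the function `train_on_synthetic_data`, the learning rate, and the synthetic reward signal are all hypothetical choices made for this sketch.

```python
import random


class RewardPredictor:
    """Hypothetical linear predictor of reward from continuous sensor values."""

    def __init__(self, n_inputs, learning_rate=0.1):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, sensors):
        # Anticipated reward is a weighted sum of the raw sensor readings.
        return self.bias + sum(w * s for w, s in zip(self.weights, sensors))

    def update(self, sensors, reward):
        # Delta rule: move the prediction toward the reward actually received.
        error = reward - self.predict(sensors)
        self.bias += self.learning_rate * error
        self.weights = [
            w + self.learning_rate * error * s
            for w, s in zip(self.weights, sensors)
        ]
        return error


def train_on_synthetic_data(steps=5000, seed=0):
    """Train on a made-up linear reward signal (an assumption of this sketch)."""
    rng = random.Random(seed)
    predictor = RewardPredictor(n_inputs=2)
    for _ in range(steps):
        sensors = [rng.random(), rng.random()]  # two continuous readings in [0, 1)
        reward = 0.8 * sensors[0] - 0.3 * sensors[1]  # hypothetical reward
        predictor.update(sensors, reward)
    return predictor
```

After training, `train_on_synthetic_data().predict([0.5, 0.5])` should be close to the synthetic reward for that reading. A real robot would replace the synthetic signal with a reward derived from its own internal state, which this sketch does not attempt to model.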