2008 Special Issue: Finding intrinsic rewards by embodied evolution and constrained reinforcement learning

  • Authors:
  • Eiji Uchibe; Kenji Doya

  • Affiliations:
  • Okinawa Institute of Science and Technology, Okinawa 904-2234, Japan; Okinawa Institute of Science and Technology, Okinawa 904-2234, Japan, Nara Institute of Science and Technology, Nara, Japan, and ATR Computational Neuroscience Laboratories, Japan

  • Venue:
  • Neural Networks
  • Year:
  • 2008

Abstract

Understanding the design principles of reward functions is a substantial challenge in both artificial intelligence and neuroscience. Successful acquisition of a task usually requires rewards not only for reaching the goal but also for intermediate states, in order to promote effective exploration. This paper proposes a method for designing 'intrinsic' rewards of autonomous agents by combining constrained policy gradient reinforcement learning and embodied evolution. To validate the method, we use Cyber Rodent robots, for which collision avoidance, recharging from battery packs, and 'mating' by software reproduction are the three major 'extrinsic' rewards. We show in hardware experiments that the robots can find appropriate 'intrinsic' rewards for the vision of battery packs and other robots, which promote approach behaviors.
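
The sketch below illustrates the general idea described in the abstract, not the paper's actual algorithm: an outer evolutionary loop searches over intrinsic-reward weights while an inner policy-gradient (REINFORCE-style) loop learns a policy on the combined extrinsic plus intrinsic reward, with extrinsic return used as fitness. All names (`intrinsic_reward`, `run_episode`, `learn_policy`), the simulated sensors, and the parameter choices are illustrative assumptions; the constrained aspect of the policy gradient and the robot-to-robot exchange of parameters ("mating") are only crudely stood in for.

```python
import numpy as np

# Hypothetical sketch of "intrinsic rewards by embodied evolution + policy gradient".
# Outer loop: evolve intrinsic-reward weights. Inner loop: policy gradient on the
# combined extrinsic + intrinsic reward. Environment and rewards are toy stand-ins.

rng = np.random.default_rng(0)

N_FEATURES = 4   # e.g. visual features such as "battery pack in view", "robot in view"
N_ACTIONS = 3    # e.g. turn left, go straight, turn right
POP_SIZE = 8     # number of intrinsic-reward candidates per generation


def intrinsic_reward(w, features):
    """Intrinsic reward as a weighted sum of sensory features."""
    return float(w @ features)


def run_episode(theta, w_intrinsic, T=50):
    """One episode; returns extrinsic return and a REINFORCE gradient estimate."""
    grad = np.zeros_like(theta)
    ext_return = 0.0
    for _ in range(T):
        features = rng.random(N_FEATURES)          # stand-in for robot sensors
        logits = theta @ features                  # linear-softmax policy
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        a = rng.choice(N_ACTIONS, p=probs)
        # Stand-in extrinsic reward (e.g. reaching a battery pack).
        r_ext = 1.0 if (a == 0 and features[0] > 0.8) else 0.0
        r = r_ext + intrinsic_reward(w_intrinsic, features)
        # Gradient of log pi(a|s) for a linear-softmax policy.
        dlog = -np.outer(probs, features)
        dlog[a] += features
        grad += r * dlog
        ext_return += r_ext
    return ext_return, grad


def learn_policy(w_intrinsic, episodes=20, lr=0.01):
    """Inner loop: policy gradient on combined reward; fitness = extrinsic return."""
    theta = np.zeros((N_ACTIONS, N_FEATURES))
    total_ext = 0.0
    for _ in range(episodes):
        ext, grad = run_episode(theta, w_intrinsic)
        theta += lr * grad
        total_ext += ext
    return total_ext


# Outer loop: keep the best intrinsic-reward weights and perturb them, a crude
# stand-in for robots exchanging reward parameters ("mating") during evolution.
population = [rng.normal(0.0, 0.1, N_FEATURES) for _ in range(POP_SIZE)]
for generation in range(5):
    fitness = [learn_policy(w) for w in population]
    best = population[int(np.argmax(fitness))]
    population = [best + rng.normal(0.0, 0.05, N_FEATURES) for _ in range(POP_SIZE)]
    print(f"gen {generation}: best extrinsic return = {max(fitness):.2f}")
```

Running the script prints the best extrinsic return per generation; in this toy setting, intrinsic weights that reward the "battery pack in view" feature tend to be selected because they steer the inner learner toward states where extrinsic reward is obtainable, mirroring the approach behaviors reported in the abstract.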