Learning CPG sensory feedback with policy gradient for biped locomotion for a full-body humanoid

  • Authors:
  • Gen Endo; Jun Morimoto; Takamitsu Matsubara; Jun Nakanishi; Gordon Cheng

  • Affiliations:
  • Sony Intelligence Dynamics Laboratories, Inc., Shinagawa-ku, Tokyo, Japan and ATR Computational Neuroscience Laboratories, Soraku-gun, Kyoto, Japan; ATR Computational Neuroscience Laboratories, Soraku-gun, Kyoto, Japan and Computational Brain Project, ICORP, Japan Science and Technology Agency, Soraku-gun, Kyoto, Japan; ATR Computational Neuroscience Laboratories, Soraku-gun, Kyoto, Japan and Nara Institute of Science and Technology, Ikoma-shi, Nara, Japan; ATR Computational Neuroscience Laboratories, Soraku-gun, Kyoto, Japan and Computational Brain Project, ICORP, Japan Science and Technology Agency, Soraku-gun, Kyoto, Japan; ATR Computational Neuroscience Laboratories, Soraku-gun, Kyoto, Japan and Computational Brain Project, ICORP, Japan Science and Technology Agency, Soraku-gun, Kyoto, Japan

  • Venue:
  • AAAI'05: Proceedings of the 20th National Conference on Artificial Intelligence - Volume 3
  • Year:
  • 2005

Abstract

This paper describes a learning framework for a central pattern generator (CPG) based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm for the CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback controller can be acquired within a thousand trials in numerical simulation, and that the controller obtained in simulation achieves stable walking with a physical robot in the real world. Walking velocity and stability are evaluated in both numerical simulations and hardware experiments. Furthermore, we present the possibility of additional online learning on the hardware robot to improve the controller within 200 iterations.