Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot
International Journal of Robotics Research
This paper describes a learning framework for a central pattern generator (CPG) based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm with the CPG by reducing the dimensionality of the state space used for learning. We demonstrate in numerical simulations that an appropriate feedback controller can be acquired within a thousand trials, and that the controller obtained in simulation achieves stable walking with a physical robot in the real world. We evaluate walking velocity and stability in both numerical simulations and hardware experiments. Furthermore, we present the possibility of additional online learning on the hardware robot to improve the controller within 200 iterations.