Policy Learning --- A Unified Perspective with Applications in Robotics

Authors:
Jan Peters;Jens Kober;Duy Nguyen-Tuong
Affiliations:
Max-Planck Institute for Biological Cybernetics, Tübingen 72074 and University of Southern California, Los Angeles, CA 90089, USA; Max-Planck Institute for Biological Cybernetics, Tübingen 72074; Max-Planck Institute for Biological Cybernetics, Tübingen 72074
Venue:
Recent Advances in Reinforcement Learning
Year:
2008

Abstract

Policy learning approaches are among the best-suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper, we make two contributions. First, we present a unified perspective from which several policy learning algorithms can be derived, namely policy gradient algorithms, natural-gradient algorithms, and EM-like policy learning. Second, we present several applications to both robot motor primitive learning and robot control in task space. Results from simulation and from several different real robots are shown.