Efficient skill acquisition is crucial for creating versatile robots. One intuitive way to teach a robot new tricks is to demonstrate a task and enable the robot to imitate the demonstrated behavior. This approach is known as imitation learning. Classical methods of imitation learning, such as inverse reinforcement learning or behavioral cloning, suffer from the correspondence problem when the actions (i.e., motor commands, torques, or forces) of the teacher are not observed or when the body of the teacher differs substantially from the robot's, e.g., in its actuation. To address these drawbacks, we propose to learn a robot-specific controller that directly matches robot trajectories with observed ones. We present a novel and robust probabilistic model-based approach for solving a probabilistic trajectory matching problem via policy search. For this purpose, we learn a probabilistic model of the system, which we exploit for mental rehearsal of the current controller by making predictions about future trajectories. These internal simulations allow a controller to be learned without constantly interacting with the real system, which reduces the overall interaction time. Using long-term predictions from this learned model, we train robot-specific controllers that reproduce the expert's distribution of demonstrations without the need to observe motor commands during the demonstration. The strength of our approach is that it addresses the correspondence problem in a principled way. Our method achieves a higher learning speed than both model-based imitation learning based on dynamic movement primitives and trial-and-error-based learning systems with hand-crafted cost functions. We successfully applied our approach to imitating human behavior using a tendon-driven compliant robotic arm. Moreover, we demonstrate the generalization ability of our approach in a multi-task learning setup.
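To make the trajectory-matching idea concrete, the imitation cost can be understood as a divergence between two trajectory distributions: the one predicted by the learned probabilistic model under the current controller, and the one estimated from the expert's demonstrations. The sketch below illustrates this with per-time-step Gaussian marginals and a summed KL divergence; it is a minimal, hypothetical illustration of the principle, not the authors' implementation, and the function names (`kl_gaussian`, `trajectory_matching_cost`) are assumptions for this example.

```python
import numpy as np

def kl_gaussian(mu0, S0, mu1, S1):
    """KL divergence KL(N(mu0, S0) || N(mu1, S1)) between two Gaussians."""
    k = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)
                  + diff @ S1_inv @ diff
                  - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

def trajectory_matching_cost(pred_means, pred_covs, demo_means, demo_covs):
    """Sum per-time-step KL divergences between the state distribution
    predicted for the current controller and the demonstration distribution.
    A policy-search method would minimize this cost over controller parameters."""
    return sum(kl_gaussian(pm, pc, dm, dc)
               for pm, pc, dm, dc in zip(pred_means, pred_covs,
                                         demo_means, demo_covs))
```

Because the cost compares trajectory distributions rather than motor commands, no teacher actions are needed, which is how this formulation sidesteps the correspondence problem.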