Compliant skills acquisition and multi-optima policy search with EM-based reinforcement learning

Authors:
Sylvain Calinon;Petar Kormushev;Darwin G. Caldwell
Venue:
Robotics and Autonomous Systems
Year:
2013




Abstract

The democratization of robotics technology and the development of new actuators are progressively bringing robots closer to humans. The applications that can now be envisaged contrast sharply with the requirements of industrial robots. In standard manufacturing settings, the criteria used to assess performance are usually related to the robot's accuracy, repeatability, speed or stiffness. Learning a control policy to actuate such robots is characterized by the search for a single solution to the task, with the policy represented as a set of points through which the robot moves to follow a trajectory. In new environments such as homes and offices populated with humans, reproduction performance is assessed differently. These robots are expected to acquire rich motor skills that can be generalized to new situations, while behaving safely in the vicinity of users. Skills acquisition can no longer be guided by a single form of learning, and must instead combine different approaches to continuously create, adapt and refine policies. The family of search strategies based on expectation-maximization (EM) looks particularly promising for coping with these new requirements. Exploration can be performed directly in the policy parameter space, by refining the policy together with exploration parameters represented in the form of covariances. With this formulation, reinforcement learning (RL) can be extended to a multi-optima search problem in which several policy alternatives can be considered. We present here two applications exploiting EM-based exploration strategies, by considering parameterized policies based on dynamical systems, and by using Gaussian mixture models to search for multiple policy alternatives.
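
To make the idea of exploring directly in policy parameter space more concrete, the following is a minimal sketch of a reward-weighted, EM-style update of a single Gaussian search distribution, in which the mean plays the role of the current policy and the covariance plays the role of the exploration parameters. It is not the authors' algorithm: the toy `reward` function, its optimum at (1, -2), the function name `em_policy_search`, and all default settings are assumptions chosen purely for illustration.

```python
import numpy as np

# Hypothetical optimum of the toy reward below (illustrative assumption).
TARGET = np.array([1.0, -2.0])


def reward(theta):
    # Toy reward in (0, 1] with a single optimum at TARGET.
    return np.exp(-np.sum((theta - TARGET) ** 2))


def em_policy_search(mean, cov, n_iters=30, n_samples=200, seed=0):
    """Reward-weighted update of a Gaussian over policy parameters."""
    rng = np.random.default_rng(seed)
    dim = mean.shape[0]
    for _ in range(n_iters):
        # Exploration: draw candidate policy parameters from the current Gaussian.
        thetas = rng.multivariate_normal(mean, cov, size=n_samples)
        # Positive rewards are normalized and used as sample weights.
        r = np.array([reward(t) for t in thetas])
        w = r / r.sum()
        # Weighted maximum-likelihood re-estimation of mean and covariance.
        mean = w @ thetas
        diff = thetas - mean
        # A small diagonal term keeps the covariance positive definite.
        cov = (diff * w[:, None]).T @ diff + 1e-6 * np.eye(dim)
    return mean, cov


if __name__ == "__main__":
    mean, cov = em_policy_search(np.zeros(2), 4.0 * np.eye(2))
    print("estimated optimum:", mean)
    print("remaining exploration covariance:")
    print(cov)
```

In this sketch the covariance shrinks as the samples concentrate around the high-reward region, so exploration narrows without a hand-tuned schedule. A multi-optima variant along the lines mentioned in the abstract would replace the single Gaussian with a Gaussian mixture model; that extension is not shown here.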