Apprenticeship learning and reinforcement learning with application to robotic control

  • Authors: Pieter Abbeel
  • Affiliations: Stanford University
  • Venue: Apprenticeship learning and reinforcement learning with application to robotic control
  • Year: 2008

Abstract

Many problems in robotics have unknown, stochastic, high-dimensional, and highly nonlinear dynamics, and pose significant challenges to both traditional control methods and reinforcement learning algorithms. Some of the key difficulties that arise in these problems are: (i) It is often difficult to write down, in closed form, a formal specification of the control task. For example, what is the objective function for "flying well"? (ii) It is often difficult to build a good dynamics model because of both data collection and data modeling challenges (similar to the "exploration problem" in reinforcement learning). (iii) It is often computationally expensive to find closed-loop controllers for high-dimensional, stochastic domains. We describe learning algorithms with formal performance guarantees which show that these problems can be efficiently addressed in the apprenticeship learning setting, that is, the setting in which expert demonstrations of the task are available. Our algorithms are guaranteed to return a control policy with performance comparable to the expert's, where performance is evaluated on the same task and in the same (typically stochastic, high-dimensional, and nonlinear) environment as the expert. Besides having theoretical guarantees, our algorithms have also enabled us to solve some previously unsolved real-world control problems: they have enabled a quadruped robot to traverse challenging, previously unseen terrain, and they have significantly extended the state of the art in autonomous helicopter flight. Our helicopter has performed by far the most challenging aerobatic maneuvers performed by any autonomous helicopter to date, including continuous in-place flips, rolls, and tic-tocs, which only exceptional expert human pilots can fly. Our aerobatic flight performance is comparable to that of the best human pilots.