Least absolute policy iteration for robust value function approximation

  • Authors:
  • Masashi Sugiyama; Hirotaka Hachiya; Hisashi Kashima; Tetsuro Morimura

  • Affiliations:
  • Department of Computer Science, Tokyo Institute of Technology, Japan (Sugiyama; Hachiya); IBM Research, Tokyo Research Laboratory, Japan (Kashima; Morimura)

  • Venue:
  • ICRA'09: Proceedings of the 2009 IEEE International Conference on Robotics and Automation
  • Year:
  • 2009

Abstract

Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers in observed rewards. In this paper, we propose an alternative method that employs the absolute loss to enhance robustness and reliability. The proposed method is formulated as a linear programming problem that can be solved efficiently by standard optimization software, so robustness is gained without sacrificing the computational advantage. We demonstrate the usefulness of the proposed approach through simulated robot-control tasks.
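The core idea of the abstract — replacing the squared loss with the absolute loss and solving the resulting fit as a linear program — can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it shows only the generic least-absolute-deviation fitting step (minimize the sum of |Φw − r| over weights w) via the standard slack-variable LP reformulation, using `scipy.optimize.linprog`. The feature matrix `Phi` and reward vector `r` are illustrative placeholders.

```python
import numpy as np
from scipy.optimize import linprog

def least_absolute_fit(Phi, r):
    """Minimize sum_i |Phi[i] @ w - r[i]| over w, posed as an LP.

    Introduce slacks e_i >= |Phi[i] @ w - r[i]|, i.e.
        Phi @ w - e <= r   and   -Phi @ w - e <= -r,
    then minimize sum(e). Decision vector is x = [w, e].
    """
    n, d = Phi.shape
    c = np.concatenate([np.zeros(d), np.ones(n)])       # objective: sum of slacks
    I = np.eye(n)
    A_ub = np.block([[Phi, -I], [-Phi, -I]])            # the two residual bounds
    b_ub = np.concatenate([r, -r])
    bounds = [(None, None)] * d + [(0, None)] * n       # w free, slacks nonnegative
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:d]

# Toy data: rewards on a line, with one large outlier (hypothetical example)
x = np.arange(10, dtype=float)
Phi = np.column_stack([x, np.ones_like(x)])             # linear basis functions
r = 2.0 * x + 1.0
r[5] += 100.0                                           # corrupt one observation
w = least_absolute_fit(Phi, r)
```

In this toy example, the absolute-loss fit passes through the nine uncorrupted points, recovering slope 2 and intercept 1 despite the outlier; an ordinary least-squares fit would be pulled noticeably toward the corrupted reward. This is the robustness property the abstract claims for the reward-fitting step.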