Inverse reinforcement learning in partially observable environments

  • Authors:
  • Jaedeug Choi; Kee-Eung Kim

  • Affiliations:
  • Department of Computer Science, Korea Advanced Institute of Science and Technology, Daejeon, Korea; Department of Computer Science, Korea Advanced Institute of Science and Technology, Daejeon, Korea

  • Venue:
  • IJCAI'09: Proceedings of the 21st International Joint Conference on Artificial Intelligence
  • Year:
  • 2009

Abstract

Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most existing IRL algorithms assume that the expert's environment is modeled as a Markov decision process (MDP), although handling partially observable settings would widen their applicability to more realistic scenarios. In this paper, we present an extension of the classical IRL algorithm by Ng and Russell to partially observable environments. We discuss technical issues and challenges, and present experimental results on several benchmark partially observable domains.
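
For context, the Ng and Russell algorithm that the paper builds on casts IRL in a fully observable, finite MDP as a linear program: find a reward vector under which the expert's policy remains optimal, while maximizing the margin over non-expert actions and penalizing the L1 norm of the reward. Below is a minimal sketch of that MDP base case (not the paper's POMDP extension) using NumPy and SciPy; the function `ng_russell_irl`, the chain example, and all parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def ng_russell_irl(P, policy, gamma=0.9, r_max=1.0, lam=1.0):
    """LP-based IRL for a finite MDP in the style of Ng & Russell (2000).

    P      : array of shape (A, S, S); P[a][s, s'] = transition probability.
    policy : length-S array; policy[s] = expert's action in state s.
    Returns a length-S reward vector consistent with the expert policy.
    """
    A, S, _ = P.shape
    # Transition matrix under the expert policy: row s uses P[policy[s], s].
    P_pi = np.array([P[policy[s], s] for s in range(S)])
    M = np.linalg.inv(np.eye(S) - gamma * P_pi)  # (I - gamma * P_pi)^-1

    # Decision variables x = [R (S), t (S), u (S)]:
    # t_s is the per-state margin, u_s >= |R_s| linearizes the L1 penalty.
    n_var = 3 * S
    c = np.concatenate([np.zeros(S), -np.ones(S), lam * np.ones(S)])

    A_ub, b_ub = [], []
    for s in range(S):
        for a in range(A):
            if a == policy[s]:
                continue
            # Value-difference row: (P_pi(s) - P_a(s)) (I - gamma * P_pi)^-1
            row = (P[policy[s], s] - P[a, s]) @ M
            # t_s <= row . R  (margin is the minimum over non-expert actions)
            con = np.zeros(n_var); con[:S] = -row; con[S + s] = 1.0
            A_ub.append(con); b_ub.append(0.0)
            # row . R >= 0   (expert's action must stay optimal)
            con = np.zeros(n_var); con[:S] = -row
            A_ub.append(con); b_ub.append(0.0)
        # u_s >= |R_s| via two one-sided constraints
        con = np.zeros(n_var); con[s] = 1.0; con[2 * S + s] = -1.0
        A_ub.append(con); b_ub.append(0.0)
        con = np.zeros(n_var); con[s] = -1.0; con[2 * S + s] = -1.0
        A_ub.append(con); b_ub.append(0.0)

    bounds = ([(-r_max, r_max)] * S   # R bounded by R_max
              + [(None, None)] * S    # t free
              + [(0, None)] * S)      # u nonnegative
    res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                  bounds=bounds, method="highs")
    return res.x[:S]

# Toy 3-state chain: action 1 moves right, action 0 moves left.
S = 3
P = np.zeros((2, S, S))
for s in range(S):
    P[1, s, min(s + 1, S - 1)] = 1.0  # right
    P[0, s, max(s - 1, 0)] = 1.0      # left
expert = np.array([1, 1, 1])          # expert always moves right
# Expect the recovered reward to concentrate at the right end of the chain.
print(ng_russell_irl(P, expert))
```

The key obstacle the paper addresses is that this formulation relies on enumerating states and comparing actions state by state, which is not directly possible in a POMDP, where the agent acts on beliefs over states rather than observed states.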