Acting optimally in partially observable stochastic domains
AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
Learning agents for uncertain environments (extended abstract)
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Planning and acting in partially observable stochastic domains
Artificial Intelligence
Policy invariance under reward transformations: Theory and application to reward shaping
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Algorithms for inverse reinforcement learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Finite-memory control of partially observable systems
Apprenticeship learning via inverse reinforcement learning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Partially observable Markov decision processes for spoken dialog systems
Computer Speech and Language
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Maximum entropy inverse reinforcement learning
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Bayesian inverse reinforcement learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Inverse reinforcement learning in partially observable environments
The Journal of Machine Learning Research
Discrete relative states to learn and recognize goals-based behaviors of groups
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most existing IRL algorithms assume that the expert's environment is modeled as a Markov decision process (MDP), although handling partially observable settings would widen their applicability to more realistic scenarios. In this paper, we present an extension of the classical IRL algorithm by Ng and Russell to partially observable environments. We discuss the technical issues and challenges, and present experimental results on several benchmark partially observable domains.
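For reference, the sketch below illustrates the classical linear-programming formulation of IRL by Ng and Russell for a fully observable MDP with a known expert policy, i.e. the starting point that the paper extends to partially observable environments. It is a minimal toy implementation: the transition matrices, discount factor, penalty weight, and the helper name irl_lp are illustrative assumptions, not details taken from the paper.

import numpy as np
from scipy.optimize import linprog

def irl_lp(P, policy, gamma=0.9, r_max=1.0, lam=1.0):
    """Recover a reward vector under which `policy` is optimal for the MDP.

    P      : array of shape (num_actions, N, N), P[a][s, s'] = Pr(s' | s, a)
    policy : array of length N giving the expert's action in each state
    """
    num_actions, N, _ = P.shape
    # Transition matrix induced by the expert policy (row s is P[policy[s], s]).
    P_pi = np.array([P[policy[s], s] for s in range(N)])
    M = np.linalg.inv(np.eye(N) - gamma * P_pi)        # (I - gamma * P_pi)^-1

    # Decision variables x = [R (N), t (N), u (N)]:
    # maximize sum(t) - lam * sum(u)  <=>  minimize -sum(t) + lam * sum(u).
    c = np.concatenate([np.zeros(N), -np.ones(N), lam * np.ones(N)])
    A_ub, b_ub = [], []
    for s in range(N):
        for a in range(num_actions):
            if a == policy[s]:
                continue
            d = (P_pi[s] - P[a, s]) @ M                # row vector of length N
            # Expert action stays optimal under R:  d @ R >= 0.
            row = np.zeros(3 * N); row[:N] = -d
            A_ub.append(row); b_ub.append(0.0)
            # Per-state optimality margin:  t_s <= d @ R.
            row = np.zeros(3 * N); row[:N] = -d; row[N + s] = 1.0
            A_ub.append(row); b_ub.append(0.0)
        # L1 penalty on the reward via  -u_s <= R_s <= u_s.
        row = np.zeros(3 * N); row[s] = 1.0; row[2 * N + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(3 * N); row[s] = -1.0; row[2 * N + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)

    bounds = ([(-r_max, r_max)] * N          # |R_s| <= R_max
              + [(None, None)] * N           # margins t are free
              + [(0, None)] * N)             # u >= 0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x[:N]

# Toy two-state MDP: action 0 stays put, action 1 switches states.
# The expert moves to state 1 and stays there, so the recovered reward
# assigns a higher value to state 1 than to state 0.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],      # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])     # action 1: switch
print(irl_lp(P, policy=np.array([1, 0])))

The LP maximizes the sum of per-state optimality margins with an L1 penalty on the reward, subject to the constraint that the expert's actions remain optimal under the recovered reward; the extension discussed in the paper addresses the case where the expert acts on observations rather than on the true state.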