Acting optimally in partially observable stochastic domains
AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
Learning agents for uncertain environments (extended abstract)
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Planning and acting in partially observable stochastic domains
Artificial Intelligence
Policy invariance under reward transformations: Theory and application to reward shaping
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Algorithms for inverse reinforcement learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Finite-memory control of partially observable systems
Apprenticeship learning via inverse reinforcement learning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Partially observable Markov decision processes for spoken dialog systems
Computer Speech and Language
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Maximum entropy inverse reinforcement learning
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Bayesian inverse reinforcement learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Inverse reinforcement learning in partially observable environments
The Journal of Machine Learning Research
Discrete relative states to learn and recognize goals-based behaviors of groups
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behaviour of an expert. Most existing IRL algorithms assume that the expert's environment is modeled as a Markov decision process (MDP), although handling partially observable settings would widen their applicability to more realistic scenarios. In this paper, we present an extension of the classical IRL algorithm by Ng and Russell to partially observable environments. We discuss the technical issues and challenges, and present experimental results on several benchmark partially observable domains.
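For reference, the sketch below illustrates the classical linear-programming formulation of IRL by Ng and Russell for a fully observable MDP with a known expert policy, i.e. the starting point that the paper extends to partially observable environments. It is a minimal toy implementation: the transition matrices, discount factor, penalty weight, and the helper name irl_lp are illustrative assumptions, not details taken from the paper.

import numpy as np
from scipy.optimize import linprog

def irl_lp(P, policy, gamma=0.9, r_max=1.0, lam=1.0):
    """Recover a reward vector under which `policy` is optimal for the MDP.

    P      : array of shape (num_actions, N, N), P[a][s, s'] = Pr(s' | s, a)
    policy : array of length N giving the expert's action in each state
    """
    num_actions, N, _ = P.shape
    # Transition matrix induced by the expert policy (row s is P[policy[s], s]).
    P_pi = np.array([P[policy[s], s] for s in range(N)])
    M = np.linalg.inv(np.eye(N) - gamma * P_pi)        # (I - gamma * P_pi)^-1

    # Decision variables x = [R (N), t (N), u (N)]:
    # maximize sum(t) - lam * sum(u)  <=>  minimize -sum(t) + lam * sum(u).
    c = np.concatenate([np.zeros(N), -np.ones(N), lam * np.ones(N)])
    A_ub, b_ub = [], []
    for s in range(N):
        for a in range(num_actions):
            if a == policy[s]:
                continue
            d = (P_pi[s] - P[a, s]) @ M                # row vector of length N
            # Expert action stays optimal under R:  d @ R >= 0.
            row = np.zeros(3 * N); row[:N] = -d
            A_ub.append(row); b_ub.append(0.0)
            # Per-state optimality margin:  t_s <= d @ R.
            row = np.zeros(3 * N); row[:N] = -d; row[N + s] = 1.0
            A_ub.append(row); b_ub.append(0.0)
        # L1 penalty on the reward via  -u_s <= R_s <= u_s.
        row = np.zeros(3 * N); row[s] = 1.0; row[2 * N + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(3 * N); row[s] = -1.0; row[2 * N + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)

    bounds = ([(-r_max, r_max)] * N          # |R_s| <= R_max
              + [(None, None)] * N           # margins t are free
              + [(0, None)] * N)             # u >= 0
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=bounds)
    return res.x[:N]

# Toy two-state MDP: action 0 stays put, action 1 switches states.
# The expert moves to state 1 and stays there, so the recovered reward
# assigns a higher value to state 1 than to state 0.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],      # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])     # action 1: switch
print(irl_lp(P, policy=np.array([1, 0])))

The LP maximizes the sum of per-state optimality margins with an L1 penalty on the reward, subject to the constraint that the expert's actions remain optimal under the recovered reward; the extension discussed in the paper addresses the case where the expert acts on observations rather than on the true state.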