Acting optimally in partially observable stochastic domains
AAAI'94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 2)
Learning agents for uncertain environments (extended abstract)
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Algorithms for Inverse Reinforcement Learning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Finite-memory control of partially observable systems
Apprenticeship learning via inverse reinforcement learning
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Heuristic search value iteration for POMDPs
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Learning structured prediction models: a large margin approach
ICML '05 Proceedings of the 22nd international conference on Machine learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Partially observable Markov decision processes for spoken dialog systems
Computer Speech and Language
Probabilistic planning for robotic exploration
Apprenticeship learning using linear programming
ICML '08 Proceedings of the 25th international conference on Machine learning
AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Maximum entropy inverse reinforcement learning
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Anytime point-based approximations for large POMDPs
Journal of Artificial Intelligence Research
Bayesian inverse reinforcement learning
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Planning and acting in partially observable stochastic domains
Artificial Intelligence
Inverse reinforcement learning in partially observable environments
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Training parsers by inverse reinforcement learning
Machine Learning
Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework
IEEE Journal on Selected Areas in Communications
Bayesian multitask inverse reinforcement learning
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Inverse reinforcement learning (IRL) is the problem of recovering the underlying reward function from the behavior of an expert. Most existing IRL algorithms assume that the environment is modeled as a Markov decision process (MDP), although handling partially observable settings is desirable for more realistic scenarios. In this paper, we present IRL algorithms for partially observable environments that can be modeled as a partially observable Markov decision process (POMDP). We deal with two cases according to the representation of the given expert's behavior: the case in which the expert's policy is explicitly given, and the case in which only the expert's trajectories are available. IRL in POMDPs poses a greater challenge than in MDPs, since it is not only ill-posed by the nature of IRL but also computationally intractable due to the hardness of solving POMDPs. To overcome these obstacles, we present algorithms that exploit classical results from the POMDP literature. Experimental results on several benchmark POMDP domains show that our approach is useful in partially observable settings.
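To make the underlying problem concrete, the following is a minimal sketch of the classical linear-programming IRL formulation for a small, fully observable MDP with known dynamics and an explicitly given expert policy, in the spirit of Ng and Russell's "Algorithms for Inverse Reinforcement Learning" cited above. It is illustrative only: the function name, the L1-regularization weight, and the toy two-state MDP below are assumptions of this sketch, not the paper's POMDP algorithms.

```python
import numpy as np
from scipy.optimize import linprog

def lp_irl(P, expert_policy, gamma=0.9, r_max=1.0, lam=1.1):
    """Recover a state-reward vector consistent with an expert policy.

    P has shape (A, S, S): P[a, s, s'] is the transition probability.
    expert_policy is a length-S array of expert actions.
    Solves the Ng-Russell LP: maximize the sum of per-state optimality
    margins minus lam * ||R||_1, subject to |R_s| <= r_max.
    """
    n_actions, n_states, _ = P.shape
    a_star = expert_policy
    # Transition matrix induced by following the expert policy.
    P_star = np.array([P[a_star[s], s] for s in range(n_states)])
    inv = np.linalg.inv(np.eye(n_states) - gamma * P_star)

    # LP variables: [R (S), t (S), u (S)]; t are margins, u bounds |R|.
    n = 3 * n_states
    c = np.zeros(n)
    c[n_states:2 * n_states] = -1.0   # maximize sum of margins t
    c[2 * n_states:] = lam            # ... minus lam * ||R||_1

    A_ub, b_ub = [], []
    for s in range(n_states):
        for a in range(n_actions):
            if a == a_star[s]:
                continue
            # Row vector: (P_{a*}(s) - P_a(s)) (I - gamma P_{a*})^{-1}
            diff = (P[a_star[s], s] - P[a, s]) @ inv
            # diff @ R >= t_s  (margin) and diff @ R >= 0 (optimality)
            row = np.zeros(n); row[:n_states] = -diff; row[n_states + s] = 1.0
            A_ub.append(row); b_ub.append(0.0)
            row = np.zeros(n); row[:n_states] = -diff
            A_ub.append(row); b_ub.append(0.0)
    # Encode u_s >= |R_s| for the L1 penalty.
    for s in range(n_states):
        row = np.zeros(n); row[s] = 1.0; row[2 * n_states + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n); row[s] = -1.0; row[2 * n_states + s] = -1.0
        A_ub.append(row); b_ub.append(0.0)

    bounds = ([(-r_max, r_max)] * n_states
              + [(None, None)] * n_states
              + [(0, None)] * n_states)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x[:n_states]
```

On a toy two-state MDP where action 0 stays put and action 1 swaps states, an expert who moves to state 1 and stays there yields a recovered reward that ranks state 1 above state 0, showing how the expert's choices constrain the (non-unique) reward. The POMDP case treated in the paper replaces states with beliefs, which is what makes both the constraints and their evaluation far harder.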