We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each demonstration may represent one expert trying to solve a different task, or different experts trying to solve the same task. Our main contribution is to formalise the problem as statistical preference elicitation, via a number of structured priors whose form captures our biases about the relatedness of different tasks or expert policies. In doing so, we introduce a prior on policy optimality, which is more natural to specify. We show that our framework allows us not only to learn efficiently from multiple experts but also to effectively differentiate between the goals of each. Possible applications include analysing the intrinsic motivations of subjects in behavioural experiments and learning from multiple teachers.
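To make the idea of a prior on policy optimality concrete, here is a minimal single-expert sketch, not the authors' implementation: the demonstrator is modelled as softmax-optimal, with an inverse temperature beta drawn from a Gamma prior (beta encodes how close to optimal we believe the expert is), rewards are given a Gaussian prior, and the joint posterior is sampled with Metropolis-Hastings. The toy chain MDP, the demonstrations, and all names are illustrative assumptions.

```python
# Illustrative sketch of Bayesian IRL with a prior on policy optimality.
# Not the paper's implementation; the MDP and demonstrations are made up.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.9

# Toy deterministic chain: action 0 moves left, action 1 moves right.
P = np.zeros((n_actions, n_states, n_states))
for s in range(n_states):
    P[0, s, max(s - 1, 0)] = 1.0
    P[1, s, min(s + 1, n_states - 1)] = 1.0

def q_values(reward, n_iters=200):
    """Standard value iteration; returns Q*(s, a) for a state-only reward."""
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        v = q.max(axis=1)
        q = reward[:, None] + gamma * np.einsum('ast,t->sa', P, v)
    return q

def log_likelihood(demos, reward, beta):
    """Softmax-optimal expert: P(a | s) proportional to exp(beta * Q*(s, a))."""
    q = beta * q_values(reward)
    log_pi = q - np.logaddexp.reduce(q, axis=1, keepdims=True)
    return sum(log_pi[s, a] for s, a in demos)

def log_prior(reward, beta):
    """Gaussian prior on rewards; Gamma(2, 1) prior on optimality (beta)."""
    return -0.5 * reward @ reward + np.log(beta) - beta

# Hypothetical (state, action) demonstrations from an expert heading right.
demos = [(0, 1), (1, 1), (2, 1), (3, 1)]

# Metropolis-Hastings over the joint posterior of (reward, beta).
reward, beta = np.zeros(n_states), 1.0
log_post = log_prior(reward, beta) + log_likelihood(demos, reward, beta)
samples = []
for _ in range(2000):
    r_new = reward + 0.3 * rng.standard_normal(n_states)
    b_new = abs(beta + 0.1 * rng.standard_normal())  # reflect at zero
    lp_new = log_prior(r_new, b_new) + log_likelihood(demos, r_new, b_new)
    if np.log(rng.random()) < lp_new - log_post:
        reward, beta, log_post = r_new, b_new, lp_new
    samples.append((reward.copy(), beta))

posterior_reward = np.mean([r for r, _ in samples[500:]], axis=0)
print("posterior mean reward:", np.round(posterior_reward, 2))
print("posterior mean beta:  ", np.mean([b for _, b in samples[500:]]))
```

A multitask version in the spirit of the abstract would replace the independent Gaussian reward prior with a structured, hierarchical prior that ties the experts' reward functions together, so that demonstrations from related tasks share statistical strength.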