Imitation learning is the problem of learning how to behave by observing a teacher in action. We consider imitation learning in relational domains, in which the number of objects and the relations among them vary across states. Prior work learned simple relational policies by casting imitation learning as supervised learning of a function from states to actions. For propositional worlds, functional gradient methods have proved beneficial: they are simpler to implement than most existing methods, more efficient, satisfy common constraints on the cost function more naturally, and better represent our prior beliefs about the form of the function. Building on recent generalizations of functional gradient boosting to relational representations, we develop a functional-gradient boosting approach to imitation learning in relational domains. In particular, given a set of traces from the human teacher, our system learns a policy in the form of a set of relational regression trees that additively approximate the functional gradients. The combination of multiple additive trees with a relational representation allows for learning more expressive policies than was previously possible. We demonstrate the usefulness of our approach in several different domains.
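The paper's learners operate on relational (first-order) regression trees over logical state descriptions, which are beyond a short snippet. As a simplified, propositional illustration of the boosting loop the abstract describes, the sketch below maintains a per-action potential as a sum of depth-1 regression stumps; each round fits a stump to the functional gradient of the log-likelihood of the teacher's action, namely I(a = demonstrated) − P(a | s). All class and function names here (`BoostedPolicy`, `fit_stump`) are illustrative, not the authors' implementation.

```python
import math

def fit_stump(X, y):
    """Fit a depth-1 regression stump (one feature, one threshold) by squared error."""
    best = None
    n = len(X)
    for f in range(len(X[0])):
        for t in sorted(set(x[f] for x in X)):
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((y[i] - (lm if X[i][f] <= t else rm)) ** 2 for i in range(n))
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    _, f, t, lm, rm = best
    return lambda x: lm if x[f] <= t else rm

class BoostedPolicy:
    """Softmax policy whose per-action potential is an additive sum of stumps."""
    def __init__(self, actions, n_rounds=10, lr=0.5):
        self.actions = actions
        self.n_rounds = n_rounds
        self.lr = lr
        self.trees = {a: [] for a in actions}

    def _psi(self, x, a):
        return sum(self.lr * tree(x) for tree in self.trees[a])

    def probs(self, x):
        zs = [math.exp(self._psi(x, a)) for a in self.actions]
        z = sum(zs)
        return [v / z for v in zs]

    def fit(self, states, demo_actions):
        for _ in range(self.n_rounds):
            for ai, a in enumerate(self.actions):
                # Functional gradient at each trace example: I(a == demonstrated) - P(a|s).
                grads = [(1.0 if demo_actions[i] == a else 0.0)
                         - self.probs(states[i])[ai]
                         for i in range(len(states))]
                # Each round appends one more additive regression tree (here, a stump).
                self.trees[a].append(fit_stump(states, grads))

    def act(self, x):
        p = self.probs(x)
        return self.actions[p.index(max(p))]
```

A toy usage: given teacher traces where the demonstrated action depends on a single feature, a few boosting rounds are enough for the additive potentials to reproduce the teacher's choices.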