Top-down induction of first-order logical decision trees
Artificial Intelligence
Tree based discretization for continuous state space reinforcement learning
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Artificial Intelligence
Relational reinforcement learning
Machine Learning - Special issue on inductive logic programming
Introduction to Reinforcement Learning
Neuro-Dynamic Programming
Coordinated Reinforcement Learning
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Reinforcement learning with selective perception and hidden state
SIAM Journal on Control and Optimization
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Training conditional random fields via gradient tree boosting
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Integrating Guidance into Relational Reinforcement Learning
Machine Learning
Learning decisions: robustness, uncertainty, and approximation
Tree-Based Batch Mode Reinforcement Learning
The Journal of Machine Learning Research
Combining model-based and instance-based learning for first order regression
ICML '05 Proceedings of the 22nd international conference on Machine learning
Policy Gradient in Continuous Time
The Journal of Machine Learning Research
Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)
Experiments with infinite-horizon, policy-gradient estimation
Journal of Artificial Intelligence Research
First order decision diagrams for relational MDPs
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
TildeCRF: conditional random fields for logical sequences
ECML'06 Proceedings of the 17th European conference on Machine Learning
ECML'05 Proceedings of the 16th European conference on Machine Learning
Seeing the forest despite the trees: large scale spatial-temporal decision making
UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Exploration in relational worlds
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Automatic induction of bellman-error features for probabilistic planning
Journal of Artificial Intelligence Research
Planning with noisy probabilistic relational rules
Journal of Artificial Intelligence Research
Preference-based policy iteration: leveraging preference learning for reinforcement learning
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part I
Imitation learning in relational domains: a functional-gradient boosting approach
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
Exploration in relational domains for model-based reinforcement learning
The Journal of Machine Learning Research
Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches, however, have focused on propositional and continuous domains only: without extensive feature engineering, it is difficult, if not impossible, to apply them in structured domains where, for instance, the number of objects and the relations among them vary. In this paper, we describe a non-parametric policy gradient approach, called NPPG, that overcomes this limitation. The key idea is to apply Friedman's gradient boosting: policies are represented as weighted sums of regression models grown in a stage-wise optimization. By employing off-the-shelf regression learners, NPPG can handle propositional, continuous, and relational domains in a unified way. Our experimental results show that it can even improve on established results.
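The stage-wise boosting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the environment, the table-lookup base learner (standing in for an off-the-shelf regression learner such as a relational tree), and all function names are assumptions made for illustration. The policy is a Gibbs (softmax) distribution over a potential Psi(s, a), and each boosting stage fits a new regressor to a sampled functional-gradient estimate of the expected reward.

```python
import math
import random

class TableRegressor:
    """Stand-in base learner: memorises the mean target per (state, action).
    In NPPG this slot would be filled by any regression learner, e.g. a
    (relational) regression tree."""
    def fit(self, examples, targets):
        sums, counts = {}, {}
        for x, y in zip(examples, targets):
            sums[x] = sums.get(x, 0.0) + y
            counts[x] = counts.get(x, 0) + 1
        self.table = {x: sums[x] / counts[x] for x in sums}
        return self

    def predict(self, x):
        return self.table.get(x, 0.0)

def boosted_policy_gradient(states, actions, reward,
                            n_stages=30, lr=0.5, episodes=200):
    """Hypothetical NPPG-style loop for a one-step (bandit) setting."""
    ensemble = []  # Psi(s, a) is the weighted sum of all fitted learners

    def psi(s, a):
        return sum(h.predict((s, a)) for h in ensemble)

    def policy(s):
        # Gibbs / softmax policy over the boosted potential Psi
        weights = [math.exp(psi(s, a)) for a in actions]
        z = sum(weights)
        return [w / z for w in weights]

    for _ in range(n_stages):
        xs, ys = [], []
        for _ in range(episodes):
            s = random.choice(states)
            probs = policy(s)
            a = random.choices(range(len(actions)), probs)[0]
            r = reward(s, actions[a])
            # Pointwise functional-gradient estimate of expected reward:
            # r * d log pi(a|s) / d Psi(s, b) = r * (1[b == a] - pi(b|s))
            for b, p in enumerate(probs):
                xs.append((s, actions[b]))
                ys.append(lr * r * ((1.0 if b == a else 0.0) - p))
        # Stage-wise growth: fit a fresh regressor to the gradient targets
        ensemble.append(TableRegressor().fit(xs, ys))
    return policy
```

On a toy two-state problem where the rewarding action differs per state, the boosted policy concentrates its probability mass on the correct action in each state:

```python
random.seed(0)
states, actions = ["s0", "s1"], ["left", "right"]
reward = lambda s, a: 1.0 if (s == "s0") == (a == "left") else 0.0
pi = boosted_policy_gradient(states, actions, reward)
# pi("s0") should now favour "left"; pi("s1") should favour "right"
```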