We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual value-function learning step with a learning step in policy space. This is advantageous in domains where good policies are easier to represent and learn than the corresponding value functions, which is often the case for the relational MDPs we are interested in. In order to apply API to such problems, we introduce a relational policy language and a corresponding learner. In addition, we introduce a new bootstrapping routine for goal-based planning domains, based on random walks. Such bootstrapping is necessary for many large relational MDPs, where reward is extremely sparse, as API is ineffective in such domains when initialized with an uninformed policy. Our experiments show that the resulting system is able to find good policies for a number of classical planning domains and their stochastic variants by solving them as extremely large relational MDPs. The experiments also point to some limitations of our approach, suggesting directions for future work.
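To make the policy-space variant of API concrete, the following is a minimal sketch on a toy chain MDP. Instead of fitting a value function, each iteration labels sampled states with the rollout-best action and "learns" a new policy from those labels. The chain domain, the lookup-table learner, and all parameter values here are illustrative assumptions of this sketch; the actual system described in the abstract uses a relational policy language and learner.

```python
GOAL = 5  # states 0..GOAL on a line; reward only on reaching GOAL (sparse)

def step(state, action):
    """Deterministic transition: move left (-1) or right (+1), clamped to the chain."""
    nxt = max(0, min(GOAL, state + action))
    return nxt, (1.0 if nxt == GOAL else 0.0)

def rollout_value(state, action, policy, horizon=12, gamma=0.9):
    """Estimate Q(s, a) by taking `action` once, then following `policy`."""
    s, r = step(state, action)
    total, discount = r, gamma
    for _ in range(horizon):
        if s == GOAL:
            break
        s, r = step(s, policy(s))
        total += discount * r
        discount *= gamma
    return total

def improve(policy, states, actions=(-1, +1)):
    """One API iteration in policy space: label each sampled state with its
    rollout-best action, then return a policy fit to those labels.
    A lookup table stands in for the relational policy learner."""
    labels = {s: max(actions, key=lambda a: rollout_value(s, a, policy))
              for s in states}
    return lambda s: labels.get(s, -1)

policy = lambda s: -1                      # uninformed initial policy: always move left
for _ in range(GOAL):                      # reward information propagates backward
    policy = improve(policy, range(GOAL))  # one state per iteration

print([policy(s) for s in range(GOAL)])    # -> [1, 1, 1, 1, 1]: always move right
```

The sketch also illustrates why bootstrapping matters under sparse reward: starting from the uninformed policy, rollouts see nonzero reward only in states adjacent to the goal, so policy improvement crawls backward one state per iteration. In large relational domains that signal may never reach the rollouts at all, which is the motivation for the random-walk bootstrapping routine mentioned above.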