ML92 Proceedings of the Ninth International Workshop on Machine Learning
Taxonomic syntax for first order inference
Journal of the ACM (JACM)
Learning action strategies for planning domains
Artificial Intelligence
Machine Learning
Learning Declarative Control Rules for Constraint-Based Planning
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
A logical measure of progress for planning
Eighteenth National Conference on Artificial Intelligence
Learning to fly by combining reinforcement learning with behavioural cloning
ICML '04 Proceedings of the Twenty-First International Conference on Machine Learning
The FF planning system: fast plan generation through heuristic search
Journal of Artificial Intelligence Research
AAAI'96 Proceedings of the Thirteenth National Conference on Artificial Intelligence - Volume 1
Inductive policy selection for first-order MDPs
UAI'02 Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence
Learning Control Knowledge for Forward Search Planning
The Journal of Machine Learning Research
Practical solution techniques for first-order MDPs
Artificial Intelligence
The first probabilistic track of the international planning competition
Journal of Artificial Intelligence Research
Discriminative learning of beam-search heuristics for planning
IJCAI'07 Proceedings of the 20th International Joint Conference on Artificial Intelligence
Using learned policies in heuristic-search planning
IJCAI'07 Proceedings of the 20th International Joint Conference on Artificial Intelligence
RECYCLE: Learning looping workflows from annotated traces
ACM Transactions on Intelligent Systems and Technology (TIST)
We study an approach to learning heuristics for planning domains from example solutions. There has been little work on learning heuristics for the types of domains used in the deterministic and stochastic planning competitions, perhaps because of the challenge of providing a compact heuristic language that facilitates learning. Here we introduce a new representation for heuristics based on lists of set expressions described in taxonomic syntax. We then review the idea of a measure of progress (Parmar 2002): a heuristic that is guaranteed to be improvable at every state. Taking the discovery of a measure of progress as our learning goal, we describe a simple learning algorithm for this purpose. We evaluate the approach across a range of deterministic and stochastic planning-competition domains. The results show that greedily following the learned heuristic is often highly effective, and that the heuristic can be combined with learned rule-based policies to produce still stronger results.
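The abstract's core loop can be illustrated with a minimal sketch (not the paper's implementation): a heuristic is represented as a list of set expressions, each mapping a state to a set of objects, and its value at a state is the vector of set cardinalities compared lexicographically. Greedy search repeatedly moves to a successor with a strictly larger heuristic value, which terminates whenever the heuristic is a measure of progress. All names here (`heuristic_value`, `greedy_follow`) are hypothetical.

```python
def heuristic_value(state, expressions):
    """Evaluate a list of set expressions on a state.

    Each expression stands in for a taxonomic-syntax class expression;
    the heuristic value is the tuple of cardinalities, compared
    lexicographically (larger is better in this toy formulation).
    """
    return tuple(len(expr(state)) for expr in expressions)


def greedy_follow(state, successors, expressions, max_steps=100):
    """Greedily follow the heuristic: repeatedly move to the successor
    with the largest heuristic value, stopping when no successor
    strictly improves on the current state."""
    for _ in range(max_steps):
        current = heuristic_value(state, expressions)
        best = max(successors(state),
                   key=lambda s: heuristic_value(s, expressions),
                   default=None)
        if best is None or heuristic_value(best, expressions) <= current:
            return state  # no improving successor: local optimum or goal
        state = best
    return state
```

As a toy usage, take states to be sets of achieved goals, with one set expression counting them; since every non-goal state has an improving successor, this heuristic acts as a measure of progress and greedy following reaches the goal:

```python
goals = {1, 2, 3}
succ = lambda s: [s | {g} for g in goals - s]   # add one unachieved goal
exprs = [lambda s: s]                           # "set of achieved goals"
greedy_follow(frozenset(), succ, exprs)         # reaches frozenset({1, 2, 3})
```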