Artificial Intelligence
In this paper, we introduce MHSP, a new heuristic search algorithm for real-time planning that is based on mean values. MHSP combines the principles of UCT, a bandit-based algorithm that has given very good results in computer games, especially Computer Go, with heuristic search, in order to obtain a real-time planner for classical planning. Unlike UCT, MHSP does not run simulations at leaf nodes; it replaces them with heuristic values computed by planning graph techniques. When the heuristic is admissible, it never overestimates the remaining cost, so the initial mean values of nodes are optimistic, which is a sound way of guiding exploration.
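The idea of replacing simulations with heuristic values and backing up means can be illustrated with a minimal sketch. It is not the MHSP implementation from the paper: the toy domain (reach `GOAL` by adding 1 or 2 at unit cost), the hand-written `heuristic` standing in for a planning-graph heuristic, the purely greedy descent without a UCT exploration term, and all names (`Node`, `search`, `best_successor`) are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field

GOAL = 6  # hypothetical toy goal: reach 6 from the start state


def successors(state):
    """Toy domain (assumption): add 1 or 2 at unit cost, never overshoot."""
    return [s for s in (state + 1, state + 2) if s <= GOAL]


def heuristic(state):
    """Admissible stand-in for a planning-graph heuristic: it never
    overestimates the number of remaining steps."""
    return math.ceil((GOAL - state) / 2)


@dataclass
class Node:
    state: int
    depth: int              # cost already paid to reach this node
    mean: float = 0.0       # mean of the cost estimates backed up so far
    visits: int = 0
    children: list = field(default_factory=list)


def iterate(root):
    """One iteration: descend, expand, evaluate by heuristic, back up."""
    # Descent: follow the child with the lowest mean cost estimate.
    path = [root]
    while path[-1].children:
        path.append(min(path[-1].children, key=lambda c: c.mean))
    leaf = path[-1]

    if leaf.state == GOAL:
        value = leaf.depth
    else:
        # Expansion: no simulation is run; each new child gets the
        # optimistic estimate depth + heuristic as its first sample.
        leaf.children = [Node(s, leaf.depth + 1) for s in successors(leaf.state)]
        for child in leaf.children:
            child.mean = child.depth + heuristic(child.state)
            child.visits = 1
        value = min(c.mean for c in leaf.children)

    # Backup: update the running mean of every node on the path.
    for node in path:
        node.visits += 1
        node.mean += (value - node.mean) / node.visits


def search(start, iterations):
    """Run a fixed iteration budget, as a real-time planner would."""
    root = Node(start, 0)
    for _ in range(iterations):
        iterate(root)
    return root


def best_successor(root):
    """Commit to the successor of the root with the lowest mean."""
    return min(root.children, key=lambda c: c.mean).state


root = search(0, iterations=20)
print(best_successor(root), root.mean)
```

In this sketch the iteration budget plays the role of the real-time constraint: the search can be stopped after any iteration and still return the currently best-looking successor. Because the stand-in heuristic is admissible, unexplored children start with estimates that are never worse than their true cost, so the greedy descent is still drawn toward them.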