Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation
Theory of Computing Systems
Combining online and offline knowledge in UCT
Proceedings of the 24th International Conference on Machine Learning
Bandit-based optimization on graphs with application to library performance tuning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Monte-Carlo simulation balancing
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Monte-Carlo exploration for deterministic planning
IJCAI'09 Proceedings of the 21st International Joint Conference on Artificial Intelligence
Search lessons learned from crossword puzzles
AAAI'90 Proceedings of the Eighth National Conference on Artificial Intelligence - Volume 1
UCD: Upper Confidence Bound for Rooted Directed Acyclic Graphs
TAAI '10 Proceedings of the 2010 International Conference on Technologies and Applications of Artificial Intelligence
Nested Monte-Carlo Search with AMAF Heuristic
TAAI '10 Proceedings of the 2010 International Conference on Technologies and Applications of Artificial Intelligence
Optimization of the nested Monte-Carlo algorithm on the traveling salesman problem with time windows
EvoApplications'11 Proceedings of the 2011 International Conference on Applications of Evolutionary Computation - Volume Part II
A Monte-Carlo AIXI approximation
Journal of Artificial Intelligence Research
LION'12 Proceedings of the 6th International Conference on Learning and Intelligent Optimization
Investigating monte-carlo methods on the weak schur problem
EvoCOP'13 Proceedings of the 13th European Conference on Evolutionary Computation in Combinatorial Optimization
Monte Carlo tree search (MCTS) methods have had recent success in games, planning, and optimization. MCTS uses results from rollouts to guide search; a rollout is a path that descends the tree with a randomized decision at each ply until reaching a leaf. MCTS results can be strongly influenced by the choice of policy used to bias the rollouts. Most previous work on MCTS uses static uniform random or domain-specific policies. We describe a new MCTS method for deterministic optimization problems that dynamically adapts the rollout policy during search. Our starting point is Cazenave's original Nested Monte Carlo Search (NMCS), but rather than navigating the tree directly we instead use gradient ascent on the rollout policy at each level of the nested search. We benchmark this new Nested Rollout Policy Adaptation (NRPA) algorithm and examine its behavior. Our test problems are instances of Crossword Puzzle Construction and Morpion Solitaire. Over moderate time scales NRPA can substantially improve search efficiency compared to NMCS, and over longer time scales NRPA improves upon all previously published solutions for the test problems. Results include a new Morpion Solitaire solution that improves upon the previous human-generated record, which had stood for over 30 years.
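The nested scheme the abstract describes — level-0 rollouts sampled from a softmax over move weights, with each higher level adapting the policy toward the best sequence found below it — can be sketched in Python. The toy objective, the `(step, move)` policy keys, and all constants here are illustrative assumptions, not details from the paper:

```python
import math
import random

# Toy deterministic problem (an assumption for illustration): recover a
# hidden move pattern; score = number of positions matching the target.
TARGET = [0, 2, 1, 1, 0, 2]
MOVES = (0, 1, 2)  # the same moves are legal at every step in this toy

def score(seq):
    return sum(1 for a, b in zip(seq, TARGET) if a == b)

def rollout(policy):
    """Level 0: play one sequence, sampling each move with softmax(policy)."""
    seq = []
    for step in range(len(TARGET)):
        weights = [math.exp(policy.get((step, m), 0.0)) for m in MOVES]
        seq.append(random.choices(MOVES, weights=weights)[0])
    return score(seq), seq

def adapt(policy, seq, alpha=1.0):
    """Gradient step on the rollout policy: raise the weight of each move
    in the best sequence, lower its competitors in proportion to their
    current softmax probability."""
    new = dict(policy)
    for step, move in enumerate(seq):
        z = sum(math.exp(policy.get((step, m), 0.0)) for m in MOVES)
        for m in MOVES:
            p = math.exp(policy.get((step, m), 0.0)) / z
            new[(step, m)] = new.get((step, m), 0.0) - alpha * p
        new[(step, move)] = new.get((step, move), 0.0) + alpha
    return new

def nrpa(level, policy, iterations=20):
    """Each level repeatedly calls the level below on a copy of the policy
    and adapts its own copy toward the best sequence found so far."""
    if level == 0:
        return rollout(policy)
    best_score, best_seq = -1, None
    for _ in range(iterations):
        s, seq = nrpa(level - 1, dict(policy))
        if s >= best_score:
            best_score, best_seq = s, seq
        policy = adapt(policy, best_seq)
    return best_score, best_seq

random.seed(0)
best, seq = nrpa(2, {})
print(best, seq)
```

Passing a copy of the policy down and adapting only the local copy is what distinguishes this from a single flat adaptive search: each level restarts exploration from its parent's policy while steering toward its own best result.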