Multiple overlapping tiles for contextual Monte Carlo tree search

Authors:
Arpad Rimmel; Fabien Teytaud
Affiliations:
TAO (Inria), LRI, UMR 8623 (CNRS - Univ. Paris-Sud), Orsay, France; TAO (Inria), LRI, UMR 8623 (CNRS - Univ. Paris-Sud), Orsay, France
Venue:
EvoApplications'10 Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I
Year:
2010

Abstract

Monte Carlo Tree Search (MCTS) is a recent algorithm that has achieved a growing number of successes in various domains. We propose an improvement to the Monte Carlo part of the algorithm that modifies the simulations depending on the context. The modification relies on a reward function learned on a tiling of the space of Monte Carlo simulations. Each tile groups together the Monte Carlo simulations in which a given pair of moves has been selected by one player. Experiments on the game of Havannah show that the approach is very efficient.
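
The abstract does not give implementation details, so the following is only a minimal sketch of how a reward could be learned on such a tiling, not the authors' method. It assumes that a tile is an unordered pair of moves played by the same player in one simulation, that the learned reward is a plain running mean of simulation outcomes, and that moves are comparable values such as integers. The names TileStats, update, mean and best_reply, and the default value 0.5 for unseen tiles, are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


class TileStats:
    """Running mean reward per tile (assumed: a tile is an unordered pair
    of moves played by the same player within one simulation)."""

    def __init__(self):
        self.count = defaultdict(int)    # number of simulations seen per tile
        self.total = defaultdict(float)  # summed reward per tile

    def update(self, player_moves, reward):
        # A simulation belongs to every tile whose two moves it contains,
        # so tiles overlap: one simulation updates many tiles at once.
        for pair in combinations(sorted(set(player_moves)), 2):
            self.count[pair] += 1
            self.total[pair] += reward

    def mean(self, move_a, move_b, default=0.5):
        # Mean reward of the tile (move_a, move_b); `default` is an
        # assumed prior for tiles that have never been visited.
        pair = tuple(sorted((move_a, move_b)))
        n = self.count.get(pair, 0)
        return self.total[pair] / n if n else default

    def best_reply(self, previous_move, candidates, default=0.5):
        # Hypothetical contextual choice: prefer the candidate whose tile
        # with the player's previous move has the highest mean reward.
        return max(candidates,
                   key=lambda m: self.mean(previous_move, m, default))
```

In a simulation policy, a function like best_reply would typically be mixed with random move selection rather than used greedily; how the paper biases its simulations is not stated in the abstract.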