Expected-Outcome: A General Model of Static Evaluation
IEEE Transactions on Pattern Analysis and Machine Intelligence
Local Gain Adaptation in Stochastic Gradient Descent
Combining online and offline knowledge in UCT
Proceedings of the 24th international conference on Machine learning
Monte-Carlo simulation balancing
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Associating domain-dependent knowledge and Monte Carlo approaches within a Go program
Information Sciences: an International Journal
Efficient selectivity and backup operators in Monte-Carlo tree search
CG'06 Proceedings of the 5th international conference on Computers and games
Bandit based Monte-Carlo planning
ECML'06 Proceedings of the 17th European conference on Machine Learning
Adding expert knowledge and exploration in Monte-Carlo tree search
ACG'09 Proceedings of the 12th international conference on Advances in Computer Games
Monte-Carlo tree search and rapid action value estimation in computer Go
Artificial Intelligence
Parallel Monte-Carlo tree search for HPC systems
Euro-Par'11 Proceedings of the 17th international conference on Parallel processing - Volume Part II
The grand challenge of computer Go: Monte Carlo tree search and extensions
Communications of the ACM
Simulation balancing is a new technique for tuning the parameters of a playout policy in a Monte-Carlo game-playing program. Until now, the algorithm had been tested only in an artificial setting: it was limited to 5×5 and 6×6 Go and required a stronger external program to serve as a supervisor. In this paper, the effectiveness of simulation balancing is demonstrated in a realistic setting. A state-of-the-art program, Erica, learned an improved playout policy on the 9×9 board without requiring any external expert to provide position evaluations; the target evaluations were collected by letting the program analyze positions by itself. The previous version of Erica learned pattern weights with the minorization-maximization algorithm. Thanks to simulation balancing, Erica's winning rate against Fuego 0.4 improved from 69% to 78%.
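The core idea of simulation balancing (Silver and Tegner, ICML'09, listed above) is to adjust the playout-policy parameters so that the *average* playout outcome from a position matches a target evaluation, using a policy-gradient update. The following is a minimal, hypothetical sketch on a toy one-move "game" rather than Go: the softmax policy, the two "pattern" weights, the outcomes, and the target value 0.5 are all illustrative assumptions, not the setup used by Erica.

```python
import math
import random

random.seed(0)

def softmax(theta):
    # Numerically stable softmax over the pattern weights.
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    s = sum(e)
    return [x / s for x in e]

# Toy one-move "game": action 0 wins (+1), action 1 loses (-1).
OUTCOME = [1.0, -1.0]
V_TARGET = 0.5            # target evaluation the playouts should reproduce
theta = [0.0, 0.0]        # playout-policy pattern weights (illustrative)
alpha = 0.1               # learning rate
M = 1000                  # playouts per estimate

for step in range(200):
    p = softmax(theta)
    # Monte-Carlo estimates of the playout value V and of the
    # policy-gradient term E[z * grad log pi].
    v_hat = 0.0
    grad = [0.0, 0.0]
    for _ in range(M):
        a = 0 if random.random() < p[0] else 1
        z = OUTCOME[a]
        v_hat += z / M
        # grad log pi(a) = one_hot(a) - p for a softmax over one-hot features.
        for i in range(2):
            grad[i] += z * ((1.0 if i == a else 0.0) - p[i]) / M
    # Simulation-balancing update: push the playout value toward the target.
    for i in range(2):
        theta[i] += alpha * (V_TARGET - v_hat) * grad[i]

p = softmax(theta)
v_final = 2 * p[0] - 1    # exact playout value of the learned policy
print(round(v_final, 2))  # should be close to V_TARGET = 0.5
```

In this toy setting the update drives the exact playout value 2·p[0] − 1 toward the target 0.5, i.e. the learned policy picks the winning move about 75% of the time; the same gradient structure applies when the outcomes come from full Go playouts and the targets from the program's own deep analyses, as described in the abstract.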