On the scalability of parallel UCT

Authors:
Richard B. Segal
Affiliations:
IBM Research, Yorktown Heights, NY
Venue:
CG'10 Proceedings of the 7th international conference on Computers and games
Year:
2010

Abstract

The parallelization of MCTS across multiple machines has proven surprisingly difficult. The limitations of existing algorithms were evident in the 2009 Computer Olympiad, where ZEN, using a single four-core machine, defeated both Fuego with ten eight-core machines and MoGo with twenty thirty-two-core machines. This paper investigates the limits of parallel MCTS in order to understand why distributed parallelism has proven so difficult and to pave the way towards future distributed algorithms with better scaling. We first analyze the single-threaded scaling of Fuego and find that there is an upper bound on the play-quality improvements that can come from additional search. We then analyze the scaling of an idealized N-core shared-memory machine to determine the maximum amount of parallelism supported by MCTS. We show that parallel speedup depends critically on how much time is given to each player. We use this relationship to predict parallel scaling for time scales beyond what can be evaluated empirically, given the immense computation required. Our results show that MCTS can scale nearly perfectly to at least 64 threads when combined with virtual loss, but that without virtual loss scaling is limited to just eight threads. We also find that, for competition time controls, scaling to thousands of threads is impossible, not necessarily because MCTS itself fails to scale, but because high levels of parallelism start to run up against the upper performance bound of Fuego itself.
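
The abstract names virtual loss without defining it. The sketch below illustrates one common formulation, assuming a UCB1-style selection rule: a thread that descends into a child temporarily counts an extra losing visit there, so other threads selecting at the same moment are steered towards different children. The names Node, ucb1, select_child and backup, and the exploration constant c=1.4, are hypothetical choices for illustration and are not taken from the paper or from Fuego.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    visits: int = 0            # completed playouts through this node
    wins: float = 0.0          # sum of playout results (1.0 = win)
    virtual_losses: int = 0    # playouts currently in flight through this node
    children: list = field(default_factory=list)


def ucb1(child, parent_visits, c=1.4):
    # In-flight playouts count as visits with zero reward (assumed formulation).
    n = child.visits + child.virtual_losses
    if n == 0:
        return math.inf
    mean = child.wins / n
    return mean + c * math.sqrt(math.log(parent_visits) / n)


def select_child(parent):
    # Pick the child with the highest score, then mark it as in flight.
    # A real multithreaded search would need a lock or atomic update here.
    parent_n = max(1, sum(ch.visits + ch.virtual_losses for ch in parent.children))
    best = max(parent.children, key=lambda ch: ucb1(ch, parent_n))
    best.virtual_losses += 1
    return best


def backup(child, result):
    # Replace the virtual loss with the real playout result.
    child.virtual_losses -= 1
    child.visits += 1
    child.wins += result
```

With two children that each have two visits, one with two wins and one with one win, the first call to select_child picks the stronger child, and a second call made before the first playout finishes picks the other one. Without the virtual loss, both calls would return the same child and the two threads would duplicate work.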