Design and parametric considerations for artificial neural network pruning in UCT game playing

Authors:
Clayton Burger;Mathys C. du Plessis;Charmain B. Cilliers
Affiliations:
Nelson Mandela Metropolitan University, Port Elizabeth;Nelson Mandela Metropolitan University, Port Elizabeth;Nelson Mandela Metropolitan University, Port Elizabeth
Venue:
Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference
Year:
2013

Abstract

The Upper Confidence bounds applied to Trees (UCT) algorithm has been shown to perform well in complex games, but it samples undesirable areas of the search space when building its game tree. This paper explores the design and parametric considerations for augmenting UCT with an Artificial Neural Network (NN) that dynamically prunes the game tree, thereby limiting its size. The expansion phase of UCT is augmented with a trained NN to create a novel UCT-NN variant that incorporates prior knowledge and strategy. The game of Go-Moku is used to investigate the design and parametric considerations of UCT-NN. The parameters considered are the C parameter, which balances exploration and exploitation; the NN training and structural design parameters; and the pruning schemes that could be used in UCT-NN. Parameter tuning techniques are provided for managing these parametric concerns. Results of the parameter experiments indicate that a single value of C = 1.41 is suitable for the games studied. Suitable values were found for the structural and training parameters of the NN, which were required to test the pruning schemes. Of the pruning schemes considered, an exponentially decaying scheme is found to be superior in UCT-NN: a large number of moves are pruned initially, and fewer moves are pruned at deeper plies.
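
The two ideas named in the abstract, the C-weighted selection rule and an exponentially decaying pruning scheme, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The selection rule is assumed to be the common UCB1-style form (mean reward plus C times an exploration term), and the names `uct_score`, `prune_fraction`, `prune_moves`, `initial`, and `decay`, as well as the values 0.8 and 0.5, are hypothetical choices made only for illustration. The list `nn_scores` stands in for the output of a trained NN.

```python
import math


def uct_score(child_wins, child_visits, parent_visits, c=1.41):
    """UCB1-style score for ranking a child node (assumed form, not from the paper)."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    exploitation = child_wins / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration


def prune_fraction(ply, initial=0.8, decay=0.5):
    """Hypothetical schedule: share of moves pruned shrinks exponentially with ply."""
    return initial * math.exp(-decay * ply)


def prune_moves(moves, nn_scores, ply):
    """Keep the moves the NN rates highest; always keep at least one move."""
    pruned = int(len(moves) * prune_fraction(ply))
    keep = max(1, len(moves) - pruned)
    ranked = sorted(zip(nn_scores, moves), key=lambda pair: pair[0], reverse=True)
    return [move for _, move in ranked[:keep]]
```

With these illustrative settings, `prune_moves` discards most candidate moves at the root and almost none ten plies deeper, which matches the qualitative behaviour the abstract describes for the exponentially decaying scheme.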