RSPSA: enhanced parameter optimization in games

Authors: Levente Kocsis; Csaba Szepesvári; Mark H. M. Winands
Affiliations: MTA Sztaki, Budapest, Hungary; MTA Sztaki, Budapest, Hungary; Institute for Knowledge and Agent Technology, MICC, Universiteit Maastricht, Maastricht, The Netherlands
Venue: ACG'05 Proceedings of the 11th international conference on Advances in Computer Games
Year: 2006

Abstract

Most game programs have a large number of parameters that are crucial to their performance, and tuning these parameters by hand is difficult. Automatic optimization of game-program parameters is therefore an interesting research topic. However, successful applications are known only for parameters that belong to certain components (e.g., evaluation-function parameters). The SPSA (Simultaneous Perturbation Stochastic Approximation) algorithm is an attractive choice for optimizing any kind of parameter of a game program, because of both its generality and its simplicity. Its disadvantage is that it can be very slow. In this article we propose several methods to speed up SPSA: combining it with RPROP, using common random numbers, using antithetic variables, and averaging. We test the resulting algorithm by tuning various types of parameters in two domains, Poker and Lines of Action (LOA). From the experimental study we conclude that SPSA is a viable approach to optimization in game programs, in particular when no good alternative exists for the types of parameters considered.
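To make the ingredients named in the abstract concrete, the following is a minimal Python sketch, not the authors' RSPSA implementation. It shows a two-sided SPSA gradient estimate averaged over several perturbations, a simple form of common random numbers (both evaluations share one seed), and an RPROP-style update that uses only the sign of the estimate. Antithetic variables are not shown. All names (`spsa_gradient`, `rprop_ascent_step`, `tune`, `noisy_score`), the toy objective, and every constant are assumptions chosen for illustration.

```python
import random


def spsa_gradient(f, theta, c, rng, num_perturbations=1):
    """Average two-sided SPSA gradient estimates of the noisy function f.

    f(params, seed) is a hypothetical noisy performance measure. Passing the
    same seed to both evaluations is a simple form of common random numbers.
    """
    n = len(theta)
    grad = [0.0] * n
    for _ in range(num_perturbations):
        # Perturb all parameters simultaneously with random signs (+1 or -1).
        delta = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        plus = [t + c * d for t, d in zip(theta, delta)]
        minus = [t - c * d for t, d in zip(theta, delta)]
        seed = rng.getrandbits(32)
        diff = f(plus, seed) - f(minus, seed)
        for i in range(n):
            grad[i] += diff / (2.0 * c * delta[i]) / num_perturbations
    return grad


def rprop_ascent_step(theta, grad, prev_grad, steps,
                      eta_plus=1.2, eta_minus=0.5,
                      step_min=1e-6, step_max=1.0):
    """RPROP-style ascent update, applied in place.

    Each step size grows when successive gradient signs agree and shrinks
    when they disagree; only the sign of the gradient sets the direction.
    """
    for i in range(len(theta)):
        agreement = grad[i] * prev_grad[i]
        if agreement > 0:
            steps[i] = min(steps[i] * eta_plus, step_max)
        elif agreement < 0:
            steps[i] = max(steps[i] * eta_minus, step_min)
        if grad[i] > 0:
            theta[i] += steps[i]
        elif grad[i] < 0:
            theta[i] -= steps[i]
        prev_grad[i] = grad[i]


def tune(f, theta0, iterations=100, c=0.1, initial_step=0.5,
         num_perturbations=4, seed=0):
    """Maximize f by feeding averaged SPSA estimates into the RPROP-style step."""
    rng = random.Random(seed)
    theta = list(theta0)
    prev_grad = [0.0] * len(theta)
    steps = [initial_step] * len(theta)
    for _ in range(iterations):
        grad = spsa_gradient(f, theta, c, rng, num_perturbations)
        rprop_ascent_step(theta, grad, prev_grad, steps)
    return theta


def noisy_score(params, seed, target=(1.0,)):
    """Toy stand-in for a game result: a concave score plus seeded noise."""
    noise = random.Random(seed).gauss(0.0, 1.0)
    return -sum((p - t) ** 2 for p, t in zip(params, target)) + noise
```

Calling `tune(noisy_score, [3.0])` moves the single parameter toward the toy optimum at 1.0. In this toy setting the shared seed makes the additive noise cancel in the difference of the two evaluations. In a real game program the noise depends on the parameters, so common random numbers would only reduce the variance rather than remove it.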