Temporal difference learning and TD-Gammon
Communications of the ACM
Genetic algorithms + data structures = evolution programs (3rd ed.)
Co-Evolution in the Successful Learning of Backgammon Strategy
Machine Learning
Learning to evaluate Go positions via temporal difference methods
Computational intelligence in games
Computer Go: an AI oriented survey
Artificial Intelligence
Blondie24: playing at the edge of AI
Introduction to Reinforcement Learning
Learning to Predict by the Methods of Temporal Differences
Machine Learning
ICCS '01 Proceedings of the International Conference on Computational Science-Part II
Competitive Environments Evolve Better Solutions for Complex Tasks
Proceedings of the 5th International Conference on Genetic Algorithms
Solution concepts in coevolutionary algorithms
The MaxSolve algorithm for coevolution
GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
GP-Gammon: Genetically Programming Backgammon Players
Genetic Programming and Evolvable Machines
Coevolution of neural networks using a layered pareto archive
Proceedings of the 8th annual conference on Genetic and evolutionary computation
A Monotonic Archive for Pareto-Coevolution
Evolutionary Computation
New methods for competitive coevolution
Evolutionary Computation
Emergent geometric organization and informative dimensions in coevolutionary algorithms
Evolving strategy for a probabilistic game of imperfect information using genetic programming
Genetic Programming and Evolvable Machines
Why Coevolution Doesn't "Work": Superiority and Progress in Coevolution
EuroGP '09 Proceedings of the 12th European Conference on Genetic Programming
Reinforcement learning of local shape in the game of go
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Some studies in machine learning using the game of checkers
IBM Journal of Research and Development
Coevolutionary temporal difference learning for Othello
CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
A game-theoretic memory mechanism for coevolution
GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part I
Evolution of an efficient search algorithm for the mate-in-N problem in chess
EuroGP'07 Proceedings of the 10th European conference on Genetic programming
Winning ant wars: evolving a human-competitive game strategy using fitnessless selection
EuroGP'08 Proceedings of the 11th European conference on Genetic programming
IEEE Transactions on Evolutionary Computation
Real-time neuroevolution in the NERO video game
IEEE Transactions on Evolutionary Computation
Improving coevolution by random sampling
Proceedings of the 15th annual conference on Genetic and evolutionary computation
Shaping fitness function for evolutionary learning of game strategies
Proceedings of the 15th annual conference on Genetic and evolutionary computation
Quantitative analysis of the hall of fame coevolutionary archives
Proceedings of the 15th annual conference companion on Genetic and evolutionary computation
We apply Coevolutionary Temporal Difference Learning (CTDL) to learn small-board Go strategies represented as weighted piece counters. CTDL is a randomized learning technique that interweaves two search processes operating in intra-game and inter-game modes. Intra-game learning is driven by gradient-descent Temporal Difference Learning (TDL), a reinforcement learning method that updates the board evaluation function according to the differences observed between its values for consecutively visited game states. For the inter-game learning component, we provide a coevolutionary algorithm that maintains a sample of strategies and uses the outcomes of games played between them to iteratively modify the probability distribution from which new strategies are generated and added to the sample. We analyze CTDL's sensitivity to its key parameters, including the trace decay constant that controls the lookahead horizon of TDL and the relative intensity of intra-game and inter-game learning. We also investigate how the presence of memory (an archive) affects search performance, and find that the archive-based variant is superior to the other techniques considered here, producing strategies that outperform a handcrafted weighted piece counter strategy and simple liberty-based heuristics. This encouraging result can potentially be generalized not only to other strategy representations for small-board Go, but also to other games and a broader class of problems, because CTDL is generic and does not rely on any problem-specific knowledge.
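To illustrate the intra-game component described above, here is a minimal sketch (not the authors' code) of a gradient-descent TD(0) update for a weighted piece counter: the board evaluation is linear, v(s) = w · s, where s encodes each intersection as +1 (black stone), -1 (white stone), or 0 (empty). The board size, learning rate, and function names are illustrative assumptions; the paper's method additionally uses eligibility traces governed by the trace decay constant.

```python
import random

BOARD_CELLS = 25   # assumed 5x5 small-board Go encoding
ALPHA = 0.01       # learning rate (illustrative value)

def evaluate(weights, state):
    """Linear weighted-piece-counter evaluation: v(s) = sum_i w_i * s_i."""
    return sum(w * x for w, x in zip(weights, state))

def td_update(weights, state, next_state, reward=0.0, gamma=1.0):
    """One TD(0) step on a linear model.

    The TD error delta compares the evaluations of two consecutively
    visited states; since grad_w v(s) = s, each weight moves by
    ALPHA * delta * s_i.
    """
    delta = reward + gamma * evaluate(weights, next_state) - evaluate(weights, state)
    return [w + ALPHA * delta * x for w, x in zip(weights, state)]

# Usage: apply one update for a single observed state transition.
w = [0.0] * BOARD_CELLS
s = [random.choice((-1, 0, 1)) for _ in range(BOARD_CELLS)]
s_next = [random.choice((-1, 0, 1)) for _ in range(BOARD_CELLS)]
w = td_update(w, s, s_next)
```

In CTDL, updates of this kind run within each game, while the coevolutionary algorithm operates between games, selecting and varying whole weight vectors based on game outcomes.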