Temporal difference learning and TD-Gammon
Communications of the ACM
Co-Evolution in the Successful Learning of Backgammon Strategy
Machine Learning
Reinforcement Learning
Pattern Recognition and Neural Networks
Neuro-Dynamic Programming
Crossover, Macromutation, and Population-Based Search
Proceedings of the 6th International Conference on Genetic Algorithms
GECCO '96: Proceedings of the 1st Annual Conference on Genetic and Evolutionary Computation
Coevolutionary temporal difference learning for Othello
CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
Evolving small-board Go players using coevolutionary temporal difference learning with archives
International Journal of Applied Mathematics and Computer Science
Neural networks have been used extensively as a vehicle for both genetic algorithms and reinforcement learning. This paper shows a natural way to combine the two methods and suggests that reinforcement learning may be superior to random mutation as an engine for discovering useful substructures. The paper also describes a software experiment that applies this technique to produce an Othello-playing computer program. The experiment subjects a pool of Othello-playing programs to a regime of successive adaptation cycles, where each cycle consists of an evolutionary phase, based on the genetic algorithm, followed by a learning phase, based on reinforcement learning. A key idea of the genetic implementation is the concept of feature-level crossover. The regime ran for three months, spanning 900,000 individual matches of Othello, and ultimately yielded a program that is competitive with a human-designed Othello program playing at a roughly intermediate level.
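The alternating regime described in the abstract can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the fitness function stands in for round-robin Othello match results, and the learning phase stands in for TD self-play updates (here each weight is simply nudged toward a placeholder target). The names `feature_level_crossover`, `td_learning_phase`, and the constants `N_FEATURES` and `POP_SIZE` are assumptions chosen for the sketch, not identifiers from the paper.

```python
import random

random.seed(0)

N_FEATURES = 8   # hypothetical number of board features per evaluator
POP_SIZE = 6     # hypothetical population size

def random_individual():
    # An individual is a weight vector over board features.
    return [random.uniform(-1, 1) for _ in range(N_FEATURES)]

def fitness(ind):
    # Stand-in for tournament results: closer to the target weights is better.
    return -sum((w - 0.5) ** 2 for w in ind)

def feature_level_crossover(a, b):
    # Each feature's weight is inherited intact from one parent, so
    # substructures discovered by learning survive recombination.
    return [random.choice(pair) for pair in zip(a, b)]

def td_learning_phase(ind, steps=20, alpha=0.05):
    # Stand-in for TD(0) self-play: nudge a random feature weight toward
    # a fixed target (0.5) as a placeholder for the TD error signal.
    for _ in range(steps):
        j = random.randrange(N_FEATURES)
        ind[j] += alpha * (0.5 - ind[j])
    return ind

pop = [random_individual() for _ in range(POP_SIZE)]
for cycle in range(10):
    # Evolutionary phase: keep the fitter half, refill via crossover.
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP_SIZE // 2]
    children = [feature_level_crossover(random.choice(parents),
                                        random.choice(parents))
                for _ in range(POP_SIZE - len(parents))]
    pop = parents + children
    # Learning phase: each individual refines its weights between generations.
    pop = [td_learning_phase(ind) for ind in pop]

print(round(fitness(max(pop, key=fitness)), 3))
```

The design point the sketch tries to capture is that learning, rather than random mutation, supplies the within-generation variation: crossover only recombines feature weights that the learning phase has already shaped.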