TD-Gammon, a self-teaching backgammon program, achieves master-level play
Neural Computation
Learning to Predict by the Methods of Temporal Differences
Machine Learning
KnightCap: A Chess Program That Learns by Combining TD(lambda) with Game-Tree Search
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Game playing (invited talk): the next moves
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Chess Neighborhoods, Function Combination, and Reinforcement Learning
CG '00 Revised Papers from the Second International Conference on Computers and Games
This paper describes first results from applying temporal difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. Learning proceeds entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using the piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning adjusts the piece values over the course of a series of games. The method succeeds in learning values that perform well in matches against hand-crafted values.
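As a rough illustration of how temporal difference learning can adjust piece values after a game, the following is a minimal sketch, not the program described in the abstract. It assumes a linear material evaluation squashed by tanh and a generic TD(lambda) update applied once per game. The function names, the feature encoding (piece-count differences from one side's point of view), and the parameters `alpha`, `lam`, and `scale` are all hypothetical choices made for the example.

```python
import math


def evaluate(weights, features, scale=0.1):
    """Squash a weighted material balance into (-1, 1).

    features[i] is an assumed encoding: own count minus opponent count
    of piece type i. scale is a hypothetical squashing constant.
    """
    material = sum(w * f for w, f in zip(weights, features))
    return math.tanh(scale * material)


def gradient(weights, features, scale=0.1):
    """Partial derivatives of evaluate() with respect to each weight."""
    v = evaluate(weights, features, scale)
    return [scale * (1.0 - v * v) * f for f in features]


def td_lambda_update(weights, positions, outcome,
                     alpha=0.05, lam=0.7, scale=0.1):
    """Return updated piece values after one game (generic TD(lambda)).

    positions: feature vectors for the positions seen by one side, in order.
    outcome:   final result from that side's view (+1 win, -1 loss, 0 draw).
    """
    # Predicted values for each position, followed by the actual result.
    values = [evaluate(weights, f, scale) for f in positions]
    values.append(float(outcome))
    # Temporal differences d_t = V_{t+1} - V_t.
    deltas = [values[t + 1] - values[t] for t in range(len(positions))]

    new_weights = list(weights)
    for t, feats in enumerate(positions):
        # Lambda-discounted sum of the temporal differences from t onwards.
        future = sum(lam ** (j - t) * deltas[j]
                     for j in range(t, len(deltas)))
        for i, g in enumerate(gradient(weights, feats, scale)):
            new_weights[i] += alpha * g * future
    return new_weights


# Two hypothetical piece types, both starting at value 1.0.
start = [1.0, 1.0]
game = [[1, 0], [1, 1]]  # material advantage grows during the game
after_win = td_lambda_update(start, game, outcome=+1)
after_loss = td_lambda_update(start, game, outcome=-1)
```

In this sketch, a win reached while holding a material advantage pushes the values of the pieces involved upward, and a loss from the same positions pushes them downward. Repeating the update over many self-play games is what lets the values drift toward a useful scale.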