Learning complex game functions remains a difficult task. We apply temporal difference learning (TDL), a well-known variant of the reinforcement learning approach, in combination with n-tuple networks to the game Connect-4. Our agent is trained solely by self-play. For the first time, it consistently beats a perfect-playing Minimax agent (in game situations where a win is possible). The n-tuple network induces a powerful feature space: it is not necessary to hand-design features, since the agent learns to select the relevant ones itself. We believe that the n-tuple network is an important ingredient in the overall success, and we identify several aspects that are relevant for achieving high-quality results. The architecture is general enough to be applied to similar reinforcement learning tasks as well.
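To make the abstract's two ingredients concrete, the following is a minimal, generic sketch of an n-tuple value function with a TD(0) update. It is an illustration only, not the paper's implementation: the board encoding (42 cells, base-3 states), the random choice of tuples, and all names and parameter values (`ALPHA`, `GAMMA`, `num_tuples`, `n`) are assumptions made for the example.

```python
import random

BOARD_CELLS = 42   # 7 columns x 6 rows, flattened (assumed encoding)
STATES = 3         # cell states: 0 = empty, 1 = player, 2 = opponent
ALPHA = 0.01       # learning rate (assumed value)
GAMMA = 1.0        # no discounting within an episode (assumed)

def make_tuples(num_tuples=8, n=4, seed=0):
    # Each n-tuple is a fixed list of cell indices. The paper uses
    # structured tuples; here we just sample cells at random.
    rng = random.Random(seed)
    return [rng.sample(range(BOARD_CELLS), n) for _ in range(num_tuples)]

def make_weights(tuples):
    # One lookup table per tuple, with an entry for every possible
    # combination of cell states covered by that tuple.
    return [[0.0] * (STATES ** len(t)) for t in tuples]

def tuple_index(board, cells):
    # Interpret the states of the tuple's cells as a base-3 number.
    idx = 0
    for c in cells:
        idx = idx * STATES + board[c]
    return idx

def value(board, tuples, weights):
    # The value estimate is the sum of the indexed table entries;
    # the tuples act as a large, automatically generated feature set.
    return sum(w[tuple_index(board, t)] for t, w in zip(tuples, weights))

def td_update(board, next_value, tuples, weights):
    # TD(0): move V(s) toward the (bootstrapped) value of the successor
    # state, touching only the table entries active for this board.
    delta = GAMMA * next_value - value(board, tuples, weights)
    for t, w in zip(tuples, weights):
        w[tuple_index(board, t)] += ALPHA * delta

if __name__ == "__main__":
    tuples = make_tuples()
    weights = make_weights(tuples)
    board = [0] * BOARD_CELLS
    board[0] = 1                       # one stone placed
    td_update(board, 1.0, tuples, weights)  # pretend the successor wins
    print(value(board, tuples, weights))    # estimate moved toward 1.0
```

In self-play training, the agent would generate both sides' moves with the current value function and apply `td_update` along the visited states; only the handful of table entries indexed by each board are touched, which keeps updates cheap despite the huge feature space.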