Neural Approximation of Monte Carlo Policy Evaluation Deployed in Connect Four

Authors:
Stefan Faußer;Friedhelm Schwenker
Affiliations:
Institute of Neural Information Processing, University of Ulm, Ulm, Germany 89069;Institute of Neural Information Processing, University of Ulm, Ulm, Germany 89069
Venue:
ANNPR '08 Proceedings of the 3rd IAPR workshop on Artificial Neural Networks in Pattern Recognition
Year:
2008

References:
Temporal difference learning and TD-Gammon. Communications of the ACM.
Introduction to Reinforcement Learning.
Artificial Intelligence: A Modern Approach.
High-order and multilayer perceptron initialization. IEEE Transactions on Neural Networks.

Abstract

To win a board game, or more generally to reach a specific goal in a given Markov environment, it is most important to have a policy for choosing and taking actions that leads to one of several qualitatively good states. In this paper we describe a novel method for learning a game-winning strategy. The method predicts the probability of winning from a given game state using a state-value function that is approximated by a multi-layer perceptron. These predictions improve according to rewards given in terminal states. We have deployed the method in the game Connect Four and compared its playing performance with Velena [5].
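
The idea in the abstract, a state-value function approximated by a multi-layer perceptron and trained toward the reward observed in the terminal state, can be illustrated with a minimal sketch. This is not the authors' implementation: the network size, the tanh and sigmoid activations, the learning rate, the reward coding (1 for a win, 0 for a loss), and the names `ValueMLP` and `monte_carlo_update` are all assumptions chosen for illustration, and two toy one-hot states stand in for Connect Four board encodings.

```python
import numpy as np


class ValueMLP:
    """One-hidden-layer perceptron mapping a state vector to a win estimate in (0, 1).

    Hypothetical illustration; all hyperparameters are assumed, not taken from the paper.
    """

    def __init__(self, n_inputs, n_hidden=16, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.1, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def predict(self, x):
        h = np.tanh(self.W1 @ x + self.b1)
        return 1.0 / (1.0 + np.exp(-(self.w2 @ h + self.b2)))

    def update(self, x, target):
        """One gradient step on the squared error (V(x) - target)^2 / 2."""
        h = np.tanh(self.W1 @ x + self.b1)
        v = 1.0 / (1.0 + np.exp(-(self.w2 @ h + self.b2)))
        # Error signal at the output pre-activation (sigmoid derivative is v * (1 - v)).
        delta_out = (v - target) * v * (1.0 - v)
        # Backpropagate through the tanh hidden layer, using w2 before it is changed.
        delta_hid = delta_out * self.w2 * (1.0 - h ** 2)
        self.w2 -= self.lr * delta_out * h
        self.b2 -= self.lr * delta_out
        self.W1 -= self.lr * np.outer(delta_hid, x)
        self.b1 -= self.lr * delta_hid


def monte_carlo_update(net, episode_states, reward):
    """Move V(s) toward the terminal reward for every state visited in one game."""
    for x in episode_states:
        net.update(x, reward)


# Toy usage: one state is only ever seen in won games, the other only in lost games.
win_state = np.array([1.0, 0.0])
loss_state = np.array([0.0, 1.0])
net = ValueMLP(n_inputs=2)
for _ in range(2000):
    monte_carlo_update(net, [win_state], reward=1.0)
    monte_carlo_update(net, [loss_state], reward=0.0)

print(net.predict(win_state), net.predict(loss_state))
```

In this sketch every visited state in an episode is pushed toward the same terminal reward, so after many games the output for a state approaches the fraction of games won from it, which is one way to read the "statistical probabilities to win" mentioned above.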