Minimax TD-Learning with Neural Nets in a Markov Game

Authors:
Fredrik A. Dahl;Ole Martin Halck
Affiliations:
-;-
Venue:
ECML '00 Proceedings of the 11th European Conference on Machine Learning
Year:
2000

Citing 7

Numerical recipes in C: the art of scientific computing

Practical Issues in Temporal Difference Learning
Machine Learning

Representations and solutions for game-theoretic problems
Artificial Intelligence - Special issue on economic principles of multi-agent systems

A unified analysis of value-function-based reinforcement learning algorithms
Neural Computation

Learning to Predict by the Methods of Temporal Differences
Machine Learning

Poker as Testbed for AI Research
AI '98 Proceedings of the 12th Biennial Conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence

Learning to act using real-time dynamic programming
Artificial Intelligence

Cited 1

A Reinforcement Learning Algorithm Applied to Simplified Two-Player Texas Hold'em Poker
ECML '01 Proceedings of the 12th European Conference on Machine Learning

Abstract

A minimax version of temporal difference learning (minimax TD-learning) is given, similar to minimax Q-learning. The algorithm is used to train a neural net to play Campaign, a two-player zero-sum game with imperfect information of the Markov game class. Two criteria for evaluating game-playing agents are used, and their relation to game theory is shown. Practical aspects of linear programming and fictitious play for solving matrix games are also discussed.
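
The abstract names fictitious play as one way of solving matrix games. As a rough illustration of that general technique (not the authors' implementation), the sketch below runs textbook fictitious play on a zero-sum payoff matrix. The function name `fictitious_play`, its parameters, and the matching-pennies example are assumptions introduced here and do not come from the paper.

```python
def fictitious_play(payoff, iterations=10000):
    """Approximate a solution of a zero-sum matrix game by fictitious play.

    payoff[i][j] is the row player's gain (the column player's loss)
    when the row player picks i and the column player picks j.
    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    # Cumulative payoff each pure strategy would have earned against
    # the opponent's moves so far.
    row_totals = [0.0] * n_rows
    col_totals = [0.0] * n_cols
    row_move, col_move = 0, 0  # arbitrary opening moves

    for _ in range(iterations):
        row_counts[row_move] += 1
        col_counts[col_move] += 1
        for i in range(n_rows):
            row_totals[i] += payoff[i][col_move]
        for j in range(n_cols):
            col_totals[j] += payoff[row_move][j]
        # Each player best-responds to the opponent's empirical mixture.
        row_move = max(range(n_rows), key=lambda i: row_totals[i])
        col_move = min(range(n_cols), key=lambda j: col_totals[j])

    row_strategy = [c / iterations for c in row_counts]
    col_strategy = [c / iterations for c in col_counts]
    # Payoff the row player's empirical mixture guarantees (lower bound
    # on the game value) and the most the column player's empirical
    # mixture concedes (upper bound on the game value).
    lower = min(col_totals) / iterations
    upper = max(row_totals) / iterations
    return row_strategy, col_strategy, lower, upper


# Matching pennies: the game value is 0 and both optimal strategies are uniform.
matching_pennies = [[1, -1], [-1, 1]]
row_strategy, col_strategy, lower, upper = fictitious_play(matching_pennies)
```

The gap between `upper` and `lower` brackets the game value, so it can serve as a stopping criterion. Fictitious play tends to converge slowly, which is one reason an exact linear programming solution is often considered alongside it.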