Multilayer feedforward networks are universal approximators
Neural Networks
Learning internal representations by error propagation
Parallel distributed processing: explorations in the microstructure of cognition, vol. 1
Practical Issues in Temporal Difference Learning
Machine Learning
Temporal difference learning and TD-Gammon
Communications of the ACM
Co-Evolution in the Successful Learning of Backgammon Strategy
Machine Learning
Introduction to Reinforcement Learning
Introduction to Reinforcement Learning
Learning to Predict by the Methods of Temporal Differences
Machine Learning
Value Function Based Production Scheduling
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
KnightCap: A Chess Program That Learns by Combining TD(λ) with Game-Tree Search
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Optimizing Production Manufacturing Using Reinforcement Learning
Proceedings of the Eleventh International Florida Artificial Intelligence Research Society Conference
Call admission control and routing in integrated services networks using neuro-dynamic programming
IEEE Journal on Selected Areas in Communications
Chips challenging champions: games, computers and Artificial Intelligence
Artificial Intelligence
Strategic sequential bidding in auctions using dynamic programming
Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2
Application of reinforcement learning to the game of Othello
Computers and Operations Research
Towards Optimizing Entertainment in Computer Games
Applied Artificial Intelligence
Player Co-Modelling in a Strategy Board Game: Discovering How to Play Fast
Cybernetics and Systems
Extending the Strada Framework to Design an AI for ORTS
ICEC '09 Proceedings of the 8th International Conference on Entertainment Computing
Introducing a round robin tournament into Blondie24
CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
No-limit Texas hold'em poker agents created with evolutionary neural networks
CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
Efficient selectivity and backup operators in Monte-Carlo tree search
CG'06 Proceedings of the 5th international conference on Computers and games
A Markovian process modeling for Pickomino
CG'10 Proceedings of the 7th international conference on Computers and games
A new software application for backgammon based on a heuristic algorithm
ECC'11 Proceedings of the 5th European Computing Conference
Training neural networks to play backgammon variants using reinforcement learning
EvoApplications'11 Proceedings of the 2011 international conference on Applications of evolutionary computation - Volume Part I
ARKAQ-learning: autonomous state space segmentation and policy generation
ISCIS'05 Proceedings of the 20th international conference on Computer and Information Sciences
*-Minimax performance in backgammon
CG'04 Proceedings of the 4th international conference on Computers and Games
Immune based fuzzy agent plays checkers game
Applied Soft Computing
Engineering Applications of Artificial Intelligence
Baseline: practical control variates for agent evaluation in zero-sum domains
Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
A tour of machine learning: An AI perspective
AI Communications - ECAI 2012 Turing and Anniversary Track
TD-Gammon is a neural network that teaches itself to play backgammon solely by playing against itself and learning from the results. Starting from random initial play, TD-Gammon's self-teaching methodology yields a surprisingly strong program: without lookahead, its positional judgement rivals that of human experts, and when combined with shallow lookahead it reaches a level of play that surpasses even the best human players. The success of TD-Gammon has been replicated by several other programmers; at least two other neural net programs also appear to be capable of superhuman play. Previous papers on TD-Gammon have focused on developing a scientific understanding of its reinforcement learning methodology. This paper views machine learning as a tool in a programmer's toolkit, and considers how it can be combined with other programming techniques to achieve and surpass world-class backgammon play. Particular emphasis is placed on programming shallow-depth search algorithms, and on TD-Gammon's doubling algorithm, which is described in print here for the first time. Copyright 2002 Elsevier Science B.V.
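The self-teaching loop the abstract describes rests on the TD(λ) update from Sutton's temporal-difference learning. As a minimal sketch (not TD-Gammon's actual code, which trains a multilayer neural network over a backgammon board encoding), the update can be illustrated with a linear value function over a toy feature vector; all names and the toy random-walk "game" below are illustrative assumptions:

```python
import numpy as np

def td_lambda_update(w, trace, features, v, v_next, alpha=0.1, lam=0.7):
    """One TD(lambda) step: nudge the value estimate toward its successor.

    w        -- weight vector of the value function (modified in place)
    trace    -- eligibility trace vector (modified in place)
    features -- feature vector (= gradient of v w.r.t. w for a linear model)
    v, v_next -- value estimates of the current and successor positions
    """
    td_error = v_next - v          # temporal-difference error
    trace *= lam                   # decay credit assigned to older positions
    trace += features              # add gradient of the current estimate
    w += alpha * td_error * trace  # move weights along the eligibility trace
    return td_error

# Toy self-play-style loop: each "game" is a short sequence of random
# feature vectors ending in a terminal reward of 1.0; every estimate is
# bootstrapped from the value of the successor position, as in TD-Gammon.
rng = np.random.default_rng(0)
w = np.zeros(4)
for _ in range(200):
    trace = np.zeros_like(w)
    states = [rng.random(4) for _ in range(5)]
    for i, x in enumerate(states):
        v = float(w @ x)
        v_next = 1.0 if i == len(states) - 1 else float(w @ states[i + 1])
        td_lambda_update(w, trace, x, v, v_next)
```

The key design point, which the paper's self-play methodology exploits, is that no labeled training positions are needed: each position's evaluation is pulled toward the evaluation of the position that actually follows it, with the final outcome anchoring the chain.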