TD-Gammon, a self-teaching backgammon program, achieves master-level play

Authors:
Gerald Tesauro
Affiliations:
-
Venue:
Neural Computation
Year:
1994

Citing 0
Cited 109

The time dimension of neural network models

ACM SIGART Bulletin
Elevator Group Control Using Multiple Reinforcement Learning Agents

Machine Learning
Learning Team Strategies: Soccer Case Studies

Machine Learning
Reinforcement learning and mistake bounded algorithms

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Learning to Play Chess Using Temporal Differences

Machine Learning
Computer Go: an AI oriented survey

Artificial Intelligence
Games solved: now and in the future

Artificial Intelligence - Chips challenging champions: games, computers and Artificial Intelligence
Technical Update: Least-Squares Temporal Difference Learning

Machine Learning
Metalearning and neuromodulation

Neural Networks - Computational models of neuromodulation
TD Models of reward predictive responses in dopamine neurons

Neural Networks - Computational models of neuromodulation
Recent Advances in Hierarchical Reinforcement Learning

Discrete Event Dynamic Systems
AI at IBM Research

IEEE Intelligent Systems
Optimal control using the transport equation: the Liouville machine

Adaptive Behavior
Learning to play strong poker

Machines that learn to play games
Unsupervised Learning in Metagame

AI '99 Proceedings of the 12th Australian Joint Conference on Artificial Intelligence: Advanced Topics in Artificial Intelligence
Sequential Decision Making Based on Direct Search

Sequence Learning - Paradigms, Algorithms, and Applications
From Simple Features to Sophisticated Evaluation Functions

CG '98 Proceedings of the First International Conference on Computers and Games
First Results from Using Temporal Difference Learning in Shogi

CG '98 Proceedings of the First International Conference on Computers and Games
Applications of the self-organising map to reinforcement learning

Neural Networks - New developments in self-organizing maps
An introduction to reinforcement learning theory: value function methods

Advanced lectures on machine learning
Exploring the predictable

Advances in evolutionary computing
Lyapunov design for safe reinforcement learning

The Journal of Machine Learning Research
Learning extension parameters in game-tree search

Information Sciences—Informatics and Computer Science: An International Journal - Special issue: Heuristic search and computer game playing III
A Geometric Approach to Multi-Criterion Reinforcement Learning

The Journal of Machine Learning Research
A multi-agent system integrating reinforcement learning, bidding and genetic algorithms

Web Intelligence and Agent Systems
A Reinforcement Learning Scheme for a Partially-Observable Multi-Agent Game

Machine Learning
XCS with computed prediction in multistep environments

GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
Optimal Control Using the Transport Equation: The Liouville Machine

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Reinforcement Learning in Continuous Time and Space

Neural Computation
Relational temporal difference learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Classifier prediction based on tile coding

Proceedings of the 8th annual conference on Genetic and evolutionary computation
Book Reviews

Journal of Cognitive Neuroscience
Allocating time and location information to activity-travel patterns through reinforcement learning

Knowledge-Based Systems
Evolutionary Function Approximation for Reinforcement Learning

The Journal of Machine Learning Research
Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation

Neural Computation
Application of reinforcement learning to the game of Othello

Computers and Operations Research
IFSA: incremental feature-set augmentation for reinforcement learning tasks

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
A globally optimal algorithm for TTD-MDPs

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning

Artificial Intelligence
Dynamic modeling and control of supply chain systems: A review

Computers and Operations Research
Cooperation learning in Multi-Agent Systems with annotation and reward

International Journal of Knowledge-based and Intelligent Engineering Systems
Hierarchical Average Reward Reinforcement Learning

The Journal of Machine Learning Research
Mixture of Expert Used to Learn Game Play

ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I
Reinforcement Learning: Insights from Interesting Failures in Parameter Selection

Proceedings of the 10th international conference on Parallel Problem Solving from Nature: PPSN X
Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Transferring Instances for Model-Based Reinforcement Learning

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
A reinforcement learning model for supply chain ordering management: An application to the beer game

Decision Support Systems
Tile Coding Based on Hyperplane Tiles

Recent Advances in Reinforcement Learning
A spiking neural network model of an actor-critic learning agent

Neural Computation
Stability of learning dynamics in two-agent, imperfect-information games

Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms
Performance Evaluation of Direct Heuristic Dynamic Programming using Control-Theoretic Measures

Journal of Intelligent and Robotic Systems
Transfer Learning and Intelligence: an Argument and Approach

Proceedings of the 2008 conference on Artificial General Intelligence 2008: Proceedings of the First AGI Conference
Reinforcement learning for games: failures and successes

Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers
Learning Representation and Control in Markov Decision Processes: New Frontiers

Foundations and Trends® in Machine Learning
Automatic heuristic construction in a complete general game player

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Automatic heuristic construction for general game playing

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Sample-efficient evolutionary function approximation for reinforcement learning

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Value functions for RL-based behavior transfer: a comparative study

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Improving action selection in MDP's via knowledge transfer

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Autonomous inter-task transfer in reinforcement learning domains

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Solving factored MDPs with hybrid state and action variables

Journal of Artificial Intelligence Research
Statistical feature combination for the evaluation of game positions

Journal of Artificial Intelligence Research
Learning to play using low-complexity rule-based policies: illustrations through Ms. Pac-Man

Journal of Artificial Intelligence Research
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
The computational complexity of probabilistic planning

Journal of Artificial Intelligence Research
Infinite-horizon policy-gradient estimation

Journal of Artificial Intelligence Research
Experiments with infinite-horizon, policy-gradient estimation

Journal of Artificial Intelligence Research
Learning and multiagent reasoning for autonomous agents

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
General game learning using knowledge transfer

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Reinforcement learning of local shape in the game of go

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Reinforcement Learning in RoboCup KeepAway with Partial Observability

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Reinforcement learning versus model predictive control: a comparison on a power system problem

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Collective intelligence in combinatorial games

ASM '07 The 16th IASTED International Conference on Applied Simulation and Modelling
Improving state evaluation, inference, and search in trick-based card games

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Adaptive state space partitioning for reinforcement learning

Engineering Applications of Artificial Intelligence
Evolution versus temporal difference learning for learning to play Ms. Pac-Man

CIG'09 Proceedings of the 5th international conference on Computational Intelligence and Games
MP-Draughts: a multiagent reinforcement learning system based on MLP and Kohonen-SOM neural networks

SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics
2010 Special Issue: Online learning of shaping rewards in reinforcement learning

Neural Networks
Transfer Learning for Reinforcement Learning Domains: A Survey

The Journal of Machine Learning Research
Provably Efficient Learning with Typed Parametric Models

The Journal of Machine Learning Research
RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments

The Journal of Machine Learning Research
Q-learning with linear function approximation

COLT'07 Proceedings of the 20th annual conference on Learning theory
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning

Autonomous Agents and Multi-Agent Systems
PAC-MDP learning with knowledge-based admissible models

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Coordinated learning in multiagent MDPs with infinite state-space

Autonomous Agents and Multi-Agent Systems
Time-based reward shaping in real-time strategy games

ADMI'10 Proceedings of the 6th international conference on Agents and data mining interaction
Noisy reinforcements in reinforcement learning: some case studies based on gridworlds

ACS'06 Proceedings of the 6th WSEAS international conference on Applied computer science
Dynamic game difficulty balancing for backgammon

Proceedings of the 49th Annual Southeast Regional Conference
Integrating reinforcement learning with human demonstrations of varying ability

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
An information-theoretic analysis of return maximization in reinforcement learning

Neural Networks
Self-teaching adaptive dynamic programming for Gomoku

Neurocomputing
Automatic construction of static evaluation functions for computer game players

DS'06 Proceedings of the 9th international conference on Discovery Science
Analysis and improvement of policy gradient estimation

Neural Networks
Keepaway soccer: from machine learning testbed to benchmark

RoboCup 2005
Feature extraction for decision-theoretic planning in partially observable environments

ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Analyze and guess type of piece in the computer game intelligent system

FSKD'05 Proceedings of the Second international conference on Fuzzy Systems and Knowledge Discovery - Volume Part II
Building Intelligent Interactive Tutors: Student-centered strategies for revolutionizing e-learning

Building Intelligent Interactive Tutors: Student-centered strategies for revolutionizing e-learning
Abstraction and generalization in reinforcement learning: a summary and framework

ALA'09 Proceedings of the Second international conference on Adaptive and Learning Agents
Non-linear Monte-Carlo search in civilization II

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Temporal difference method-based multi-step ahead prediction of long term deep fading in mobile networks

Computer Communications
HyperNEAT-GGP: a hyperNEAT-based atari general game player

Proceedings of the 14th annual conference on Genetic and evolutionary computation
A rapid sparsification method for kernel machines in approximate policy iteration

ISNN'12 Proceedings of the 9th international conference on Advances in Neural Networks - Volume Part I
Reinforcement learning with n-tuples on the game connect-4

PPSN'12 Proceedings of the 12th international conference on Parallel Problem Solving from Nature - Volume Part I
Sufficiency-based selection strategy for MCTS

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Monte-Carlo tree search for Bayesian reinforcement learning

Applied Intelligence
Learning via human feedback in continuous state and action spaces

Applied Intelligence
Reinforcement learning algorithms with function approximation: Recent advances and applications

Information Sciences: an International Journal

Abstract

TD-Gammon is a neural network that is able to teach itself to play backgammon solely by playing against itself and learning from the results, based on the TD(λ) reinforcement learning algorithm (Sutton 1988). Despite starting from random initial weights (and hence random initial strategy), TD-Gammon achieves a surprisingly strong level of play. With zero knowledge built in at the start of learning (i.e., given only a "raw" description of the board state), the network learns to play at a strong intermediate level. Furthermore, when a set of hand-crafted features is added to the network's input representation, the result is a truly staggering level of performance: the latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
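
For readers unfamiliar with the TD(λ) update named in the abstract, the following is a minimal sketch of the general idea only. It is not TD-Gammon's network, input encoding, or training setup: it assumes a tabular value function and a small random-walk task instead of backgammon, and the function name, parameters, and step sizes are illustrative choices, not values taken from the paper.

```python
import random


def td_lambda_random_walk(episodes=5000, n_states=5, alpha=0.02, lam=0.7, seed=0):
    """Tabular TD(lambda) with accumulating eligibility traces (illustrative).

    Assumed toy task: states 0..n_states-1 are non-terminal. Stepping left of
    state 0 ends the episode with reward 0; stepping right of the last state
    ends it with reward 1. Moves are chosen uniformly at random.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial value estimates
    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces, reset each episode
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # Temporal-difference error (undiscounted): the gap between
            # successive predictions of the final outcome.
            td_error = reward + next_value - values[state]
            traces[state] += 1.0
            for s in range(n_states):
                # Recently visited states receive more of the correction.
                values[s] += alpha * td_error * traces[s]
                traces[s] *= lam
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td_lambda_random_walk()])
```

In this toy task the true probability of finishing on the right from state i is (i + 1) / 6, so the learned values should rise from left to right. A system like the one the abstract describes replaces the table with a neural network and applies the same kind of error signal to the network's weights.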