In this paper we present TDLEAF(λ), a variation on the TD(λ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our chess program “KnightCap” used TDLEAF(λ) to learn its evaluation function while playing on Internet chess servers. The main success we report is that KnightCap improved from a 1650 rating to a 2150 rating in just 308 games and 3 days of play. As a reference, a rating of 1650 corresponds to about level B human play (on a scale from E (1000) to A (1800)), while 2150 is human master level. We discuss some of the reasons for this success, principal among them being the use of on-line play, rather than self-play. We also investigate whether TDLEAF(λ) can yield better results in the domain of backgammon, where TD(λ) has previously yielded striking success.
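To make the idea of combining TD(λ) with game-tree search more concrete, the following is a minimal sketch of a TD(λ)-style weight update applied to the leaves of the principal variations found by search, rather than to the positions actually played. It is an illustration under stated assumptions, not the KnightCap implementation: the linear evaluation function, the names `evaluate` and `tdleaf_update`, the parameters `alpha` and `lam`, and the feature vectors are all hypothetical, and the search that produces the leaf positions is assumed to have been run already. A real program would typically also fold the game result into the final value, which this sketch omits.

```python
from typing import List, Sequence


def evaluate(weights: Sequence[float], features: Sequence[float]) -> float:
    """Hypothetical linear evaluation: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def tdleaf_update(
    weights: Sequence[float],
    leaf_features: List[Sequence[float]],
    alpha: float = 0.1,
    lam: float = 0.7,
) -> List[float]:
    """Return updated weights after one game.

    leaf_features[t] is the feature vector of the principal-variation leaf
    reached by searching from the position at move t (assumed to be given).
    """
    # Evaluate every leaf with the weights held fixed for the whole game.
    values = [evaluate(weights, f) for f in leaf_features]
    # Temporal differences between successive leaf evaluations.
    diffs = [values[t + 1] - values[t] for t in range(len(values) - 1)]

    new_weights = list(weights)
    for t in range(len(diffs)):
        # Lambda-discounted sum of the temporal differences from step t onward.
        credit = sum(lam ** (j - t) * diffs[j] for j in range(t, len(diffs)))
        # For a linear evaluation, the gradient with respect to the weights
        # is simply the feature vector of the leaf.
        for i, f in enumerate(leaf_features[t]):
            new_weights[i] += alpha * credit * f
    return new_weights


if __name__ == "__main__":
    start = [1.0, 0.0]
    leaves = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]]
    print(tdleaf_update(start, leaves, alpha=0.1, lam=0.5))
```

With `lam` close to 0, each leaf is adjusted only toward the evaluation of the next leaf; with `lam` close to 1, it is adjusted toward evaluations much later in the game.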