Temporal difference learning applied to a high-performance game-playing program

  • Authors:
  • Jonathan Schaeffer; Markian Hlynka; Vili Jussila

  • Affiliations:
  • Department of Computing Science, University of Alberta, Edmonton, Canada; Department of Computing Science, University of Alberta, Edmonton, Canada; Laboratory of Computational Engineering, Helsinki University of Technology, Helsinki, Finland

  • Venue:
  • IJCAI'01: Proceedings of the 17th International Joint Conference on Artificial Intelligence - Volume 1
  • Year:
  • 2001


Abstract

The temporal difference (TD) learning algorithm offers the hope that the arduous task of manually tuning the evaluation function weights of game-playing programs can be automated. With one exception (TD-Gammon), TD learning has not been demonstrated to be effective in a high-performance, world-class game-playing program. Further, game-program developers have expressed doubt that learned weights could compete with the best hand-tuned weights. Chinook is the World Man-Machine Checkers Champion; its evaluation function weights were hand-tuned over 5 years. This paper shows that TD learning is capable of competing with the best human effort.
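
For readers unfamiliar with the technique, the sketch below illustrates a generic TD(λ) weight update for a linear evaluation function of the form V(s) = w · φ(s). The feature vector, learning rate, and trace parameters here are illustrative assumptions only and are not the ones used in Chinook or described in the paper.

```python
import numpy as np

def td_lambda_update(weights, features_t, features_t1, reward,
                     alpha=0.01, gamma=1.0, lam=0.7, trace=None):
    """One TD(lambda) step for a linear evaluation V(s) = w . phi(s).

    Hypothetical illustration only: the paper's actual feature set,
    update rule, and hyperparameters are not reproduced here.
    """
    if trace is None:
        trace = np.zeros_like(weights)
    v_t = weights @ features_t            # value estimate of current position
    v_t1 = weights @ features_t1          # value estimate of successor position
    delta = reward + gamma * v_t1 - v_t   # TD error between successive estimates
    trace = gamma * lam * trace + features_t   # accumulating eligibility trace
    weights = weights + alpha * delta * trace  # move weights toward the TD target
    return weights, trace
```

In a self-play training loop, this update would be applied after each move, with the reward set to zero for non-terminal positions and to the game outcome at the end.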