General Game Playing (GGP) aims to develop agents that can play a variety of games and, in the absence of pre-programmed game-specific knowledge, become proficient players. Most GGP players have used standard tree-search techniques enhanced by automatic heuristic learning. The UCT algorithm, a simulation-based tree search, is a newer approach that has been used successfully in GGP. However, it relies heavily on random simulations, both to assign values to unvisited nodes and to select nodes when descending the tree, and this can slow UCT's convergence. In this paper, we discuss the generation and evolution of domain-independent knowledge based on both state and move patterns, which is then used to guide the simulations in UCT. To test the improvements, we play matches between a player using the standard UCT algorithm and one using UCT enhanced with this knowledge.
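The two mechanisms the abstract refers to can be sketched as follows: UCB1-based child selection inside the tree, and a simulation policy biased by learned move scores instead of uniform random play. This is an illustrative sketch only — the function names, the softmax-style biasing, and the exploration constant are assumptions, not the paper's exact method.

```python
import math
import random

def ucb1(child_value, child_visits, parent_visits, c=1.414):
    """UCB1 score used by UCT: exploitation (mean value) plus an
    exploration bonus that shrinks as a child is visited more often."""
    if child_visits == 0:
        return float("inf")  # unvisited children are expanded first
    return (child_value / child_visits
            + c * math.sqrt(math.log(parent_visits) / child_visits))

def select_child(children, parent_visits):
    """Descend the tree by picking the child that maximizes UCB1.
    Each child is a dict with accumulated 'value' and 'visits' counts."""
    return max(children,
               key=lambda ch: ucb1(ch["value"], ch["visits"], parent_visits))

def biased_rollout_move(moves, move_score, tau=1.0, rng=random):
    """Knowledge-guided simulation step (hypothetical scheme): sample a
    move with probability proportional to exp(score / tau), so moves that
    scored well in earlier simulations are preferred over uniform play."""
    weights = [math.exp(move_score.get(m, 0.0) / tau) for m in moves]
    r = rng.random() * sum(weights)
    acc = 0.0
    for m, w in zip(moves, weights):
        acc += w
        if acc >= r:
            return m
    return moves[-1]
```

With all scores equal this reduces to the plain random rollouts of standard UCT, which is what makes a head-to-head match between the two players a direct test of the knowledge bias.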