Learning to win by reading manuals in a Monte-Carlo framework
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Learning to win by reading manuals in a Monte-Carlo framework
Journal of Artificial Intelligence Research
This paper presents a new Monte-Carlo search algorithm for very large sequential decision-making problems. We apply non-linear regression within Monte-Carlo search, online, to estimate a state-action value function from the outcomes of random roll-outs. This value function generalizes between related states and actions, and can therefore provide more accurate evaluations after fewer roll-outs. A further significant advantage of this approach is its ability to automatically extract and leverage domain knowledge from external sources such as game manuals. We apply our algorithm to the game of Civilization II, a challenging multiagent strategy game with an enormous state space and around 10^21 joint actions. We approximate the value function by a neural network, augmented by linguistic knowledge that is extracted automatically from the official game manual. We show that this non-linear value function is significantly more efficient than a linear value function, which is itself more efficient than Monte-Carlo tree search. Our non-linear Monte-Carlo search wins over 78% of games against the built-in AI of Civilization II.
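The core idea of the abstract — running random roll-outs and regressing a state-action value function on their outcomes, online, so that related actions share evaluations — can be sketched as follows. This is a minimal illustration on a hypothetical toy domain, not the paper's implementation: the paper fits a neural network over game and manual features, while this sketch uses a linear model (the paper's linear baseline) to keep the example short. All names (`MonteCarloSearch`, `featurize`, `rollout`) are invented for illustration.

```python
import random

class MonteCarloSearch:
    """Sketch: Monte-Carlo search with an online-regressed value function."""

    def __init__(self, n_features, lr=0.1):
        # Linear stand-in for the paper's non-linear (neural) approximator.
        self.weights = [0.0] * n_features
        self.lr = lr

    def value(self, features):
        # Estimated Q(s, a): dot product of weights and action features.
        return sum(w * f for w, f in zip(self.weights, features))

    def update(self, features, outcome):
        # One online regression step toward the observed roll-out outcome.
        error = outcome - self.value(features)
        for i, f in enumerate(features):
            self.weights[i] += self.lr * error * f

    def search(self, actions, featurize, rollout, n_rollouts=500):
        # Sample random roll-outs, regress the value function on their
        # outcomes, then pick the action with the highest estimated value.
        for _ in range(n_rollouts):
            a = random.choice(actions)
            self.update(featurize(a), rollout(a))
        return max(actions, key=lambda a: self.value(featurize(a)))


if __name__ == "__main__":
    random.seed(0)
    # Toy domain: action 1 wins a roll-out 80% of the time, action 0 only 20%.
    actions = [0, 1]
    featurize = lambda a: [1.0, float(a)]          # shared bias + action feature
    rollout = lambda a: 1.0 if random.random() < (0.8 if a == 1 else 0.2) else 0.0
    best = MonteCarloSearch(n_features=2).search(actions, featurize, rollout)
    print(best)
```

Because the value function generalizes across the shared bias feature, every roll-out refines both action estimates, which is the mechanism the abstract credits for needing fewer roll-outs than plain Monte-Carlo tree search.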