Upper confidence trees with short term partial information

Authors:
Olivier Teytaud; Sébastien Flory
Affiliations:
TAO, LRI, Inria Saclay-IDF, UMR, CNRS, Université Paris-Sud; Boostr
Venue:
EvoApplications'11 Proceedings of the 2011 international conference on Applications of evolutionary computation - Volume Part I
Year:
2011

Citing 8

The complexity of Markov decision processes. Mathematics of Operations Research.
Complexity of finite-horizon Markov decision process problems. Journal of the ACM (JACM).
Gambling in a rigged casino: The adversarial multi-armed bandit problem. FOCS '95 Proceedings of the 36th Annual Symposium on Foundations of Computer Science.
On the undecidability of probabilistic planning and related stochastic optimization problems. Artificial Intelligence - special issue on planning with uncertainty and incomplete information.
Games, Puzzles, and Computation.
Efficient selectivity and backup operators in Monte-Carlo tree search. CG'06 Proceedings of the 5th international conference on Computers and games.
Bandit based Monte-Carlo planning. ECML'06 Proceedings of the 17th European conference on Machine Learning.
A sublinear-time randomized approximation algorithm for matrix games. Operations Research Letters.

Cited 1

Using double-oracle method and serialized alpha-beta search for pruning in simultaneous move games. IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence.

Abstract

We show some mathematical links between partially observable (PO) games in which information is regularly revealed and simultaneous-action games. Using these links, we study the extension of Monte-Carlo Tree Search algorithms to PO games and to games with simultaneous actions. We apply the results to Urban Rivals, a free PO internet card game with more than 10 million registered users.