We develop a real-time algorithm based on Monte-Carlo game tree search for solving a quantified constraint satisfaction problem (QCSP), which is a CSP in which some variables are universally quantified. A universally quantified variable represents a choice made by nature or by an adversary. The goal of a QCSP is to construct a plan that is robust against the adversary. Obtaining a complete plan off-line, however, becomes intractable as the problem grows large, so we need a real-time algorithm that sequentially selects a promising value at each deadline. Such problems have been studied in the field of game tree search. In a standard game tree search algorithm, a good static evaluation function is crucial, but developing one for a QCSP is very difficult because it must estimate the likelihood that a partially assigned QCSP is solvable. We therefore apply a Monte-Carlo game tree search technique called UCT. A naive application of UCT does not work, because the player and the adversary are asymmetric: game sequences in which the player wins are very rare. We overcome this difficulty by introducing constraint propagation techniques. We experimentally compare the winning probability of our UCT-based algorithm with that of the state-of-the-art alpha-beta search algorithm. Our results show that our algorithm outperforms the state-of-the-art algorithm on large-scale problems.
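To make the game-tree view of a QCSP concrete, the following is a minimal Python sketch of plain UCT on a toy instance. It is an illustration under stated assumptions, not the algorithm evaluated here: the instance (exists x, forall y, exists z over {0, 1, 2}), the constraints, and all names (`satisfied`, `solvable`, `choose_value`, the exploration constant `c=1.4`) are hypothetical, and the constraint propagation step described above is deliberately omitted. The exhaustive `solvable` check is included only to state the QCSP semantics; it is the off-line computation that becomes intractable at scale.

```python
import math
import random

# Hypothetical toy QCSP (not from the paper): exists x, forall y, exists z,
# each over {0, 1, 2}, with constraints x + y <= 2 and z == x + y.
QUANTIFIERS = ["exists", "forall", "exists"]
DOMAIN = [0, 1, 2]

def satisfied(assignment):
    """Check all constraints on a complete assignment (x, y, z)."""
    x, y, z = assignment
    return x + y <= 2 and z == x + y

def solvable(assignment=()):
    """Exhaustive QCSP semantics: 'exists' needs one good value, 'forall' needs all."""
    depth = len(assignment)
    if depth == len(QUANTIFIERS):
        return satisfied(assignment)
    results = (solvable(tuple(assignment) + (v,)) for v in DOMAIN)
    return any(results) if QUANTIFIERS[depth] == "exists" else all(results)

class Node:
    def __init__(self):
        self.visits = 0
        self.wins = 0.0     # playouts won by the existential player
        self.children = {}  # value -> Node

def ucb1(child, parent_visits, maximize, c=1.4):
    """UCB1 score; the adversary (forall) scores by the player's loss rate."""
    if child.visits == 0:
        return math.inf
    mean = child.wins / child.visits
    if not maximize:
        mean = 1.0 - mean
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)

def playout(assignment, rng):
    """Fill the remaining variables uniformly at random and score the result."""
    assignment = list(assignment)
    while len(assignment) < len(QUANTIFIERS):
        assignment.append(rng.choice(DOMAIN))
    return 1.0 if satisfied(assignment) else 0.0

def uct_iteration(node, assignment, rng):
    """One selection / expansion / playout / backup pass from this node."""
    depth = len(assignment)
    if depth == len(QUANTIFIERS):
        reward = 1.0 if satisfied(assignment) else 0.0
    elif node.visits == 0:
        reward = playout(assignment, rng)  # first visit: random playout
    else:
        for v in DOMAIN:
            node.children.setdefault(v, Node())
        maximize = QUANTIFIERS[depth] == "exists"
        value = max(DOMAIN,
                    key=lambda v: ucb1(node.children[v], node.visits, maximize))
        reward = uct_iteration(node.children[value], assignment + [value], rng)
    node.visits += 1
    node.wins += reward
    return reward

def choose_value(assignment, iterations=2000, seed=0):
    """Pick a value for the next variable: the most-visited root child."""
    rng = random.Random(seed)
    root = Node()
    for _ in range(iterations):
        uct_iteration(root, list(assignment), rng)
    return max(root.children, key=lambda v: root.children[v].visits)
```

In this toy instance only x = 0 lets the player win against every y, yet a uniformly random playout after x = 0 still wins only when z happens to equal y. That scarcity of winning sequences is a small-scale version of the asymmetry noted above, and it is the reason plain playouts give a weak signal without constraint propagation.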