Guiding combinatorial optimization with UCT

Authors:
Ashish Sabharwal;Horst Samulowitz;Chandra Reddy
Affiliations:
IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY;IBM Watson Research Center, Yorktown Heights, NY
Venue:
CPAIOR'12 Proceedings of the 9th international conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems
Year:
2012

Abstract

We propose a new approach for search tree exploration in the context of combinatorial optimization, specifically Mixed Integer Programming (MIP), that is based on UCT, an algorithm for the multi-armed bandit problem designed for balancing exploration and exploitation in an online fashion. UCT has recently been highly successful in game tree search. We discuss the differences that arise when UCT is applied to search trees as opposed to bandits or game trees, and provide initial results demonstrating that the performance of even a highly optimized state-of-the-art MIP solver such as CPLEX can be boosted using UCT's guidance on a range of problem instances.
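For readers unfamiliar with the exploration/exploitation balance mentioned above, the following is a minimal sketch of the generic UCB1 selection rule that UCT applies at each tree node. It is not the authors' MIP-specific variant, which the abstract does not spell out. The function names, the `(mean_reward, visits)` child representation, the assumption that rewards lie in [0, 1], and the default exploration constant `c` are all illustrative assumptions.

```python
import math


def ucb1_score(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """Generic UCB1 score: average reward plus an exploration bonus.

    Assumes rewards are scaled to [0, 1]; `c` is a hypothetical tuning knob.
    """
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(children, parent_visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (mean_reward, visits) pairs (illustrative layout).
    """
    scores = [ucb1_score(m, n, parent_visits, c) for m, n in children]
    return max(range(len(children)), key=scores.__getitem__)
```

With `c = 0` the rule is purely greedy on the average reward. A larger `c` shifts visits toward children that have been explored less often, which is the trade-off the abstract refers to.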