Guiding combinatorial optimization with UCT

  • Authors:
  • Ashish Sabharwal; Horst Samulowitz; Chandra Reddy

  • Affiliations:
  • IBM Watson Research Center, Yorktown Heights, NY; IBM Watson Research Center, Yorktown Heights, NY; IBM Watson Research Center, Yorktown Heights, NY

  • Venue:
  • CPAIOR'12: Proceedings of the 9th International Conference on Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems
  • Year:
  • 2012

Abstract

We propose a new approach for search tree exploration in the context of combinatorial optimization, specifically Mixed Integer Programming (MIP), that is based on UCT, an algorithm for the multi-armed bandit problem designed for balancing exploration and exploitation in an online fashion. UCT has recently been highly successful in game tree search. We discuss the differences that arise when UCT is applied to search trees as opposed to bandits or game trees, and provide initial results demonstrating that the performance of even a highly optimized state-of-the-art MIP solver such as CPLEX can be boosted using UCT's guidance on a range of problem instances.
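
The exploration/exploitation balance mentioned in the abstract can be illustrated with the standard UCB1 selection rule that UCT applies at each tree node. The sketch below is a generic illustration, not the authors' MIP-specific scoring, which the abstract does not describe. The `Node` class, the reward scale, and the constant `c` are assumptions made for the example.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical search-tree node holding the statistics UCB1 needs."""
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def mean_reward(self) -> float:
        # Average reward observed below this node (0.0 if never visited).
        return self.total_reward / self.visits if self.visits else 0.0


def ucb1_score(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score: mean reward plus an exploration bonus.

    The bonus shrinks as the child is visited more often, so rarely
    visited children are eventually tried again.
    """
    if child.visits == 0:
        return math.inf  # unvisited children are always tried first
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.mean_reward() + bonus


def select_child(parent: Node, c: float = math.sqrt(2)) -> Node:
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1_score(ch, parent.visits, c))


if __name__ == "__main__":
    # Assumed toy statistics: one well-explored child, one rarely visited.
    well_explored = Node(visits=8, total_reward=6.0)
    rarely_visited = Node(visits=2, total_reward=1.0)
    root = Node(visits=10, children=[well_explored, rarely_visited])

    chosen = select_child(root)
    print("exploring rarely visited child:", chosen is rarely_visited)
```

With `c = 0` the rule becomes purely greedy and picks the child with the best average so far; larger values of `c` push the search toward less-visited children. How rewards are defined for search-tree nodes in a MIP solver is exactly the kind of difference from bandits and game trees that the paper discusses, and is left unspecified here.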