Greedy heuristics can be improved by looking ahead at each possible choice, an approach known as the rollout or Pilot method. These methods may be seen as meta-heuristics that can enhance any heuristic solution by repeatedly modifying a master solution: similarly to game-tree search, better choices are identified using lookahead, based on solutions obtained by repeatedly applying a greedy heuristic. This paper first illustrates how the Pilot method improves upon some simple, well-known dispatch heuristics for the job-shop scheduling problem. The Pilot method is then shown to be a special case of the more recent Monte Carlo Tree Search (MCTS) methods: unlike the Pilot method, MCTS methods use random completions of partial solutions to identify promising branches of the tree. The Pilot method and a simple version of MCTS, using the ε-greedy exploration paradigm, are then compared within the same framework, consisting of 300 scheduling problems of varying sizes with a fixed budget of rollouts. The results demonstrate that, in this context, MCTS matches or outperforms the Pilot method.
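The core idea of the Pilot method described above can be sketched as follows: at each construction step, every candidate choice is evaluated by completing the partial solution with the base greedy heuristic, and the choice whose completion scores best is committed. This is a minimal illustrative sketch, not the paper's implementation; the helpers `candidates`, `greedy_complete`, and `cost` are hypothetical stand-ins for the problem-specific dispatch heuristic and objective.

```python
def pilot_method(partial, candidates, greedy_complete, cost):
    """Build a solution by committing, at each step, to the choice whose
    greedy completion (lookahead via the base heuristic) scores best."""
    while candidates(partial):
        best_choice, best_cost = None, float("inf")
        for choice in candidates(partial):
            # Tentatively extend the partial solution, finish it with the
            # base greedy heuristic, and evaluate the completed solution.
            rollout = greedy_complete(partial + [choice])
            rollout_cost = cost(rollout)
            if rollout_cost < best_cost:
                best_choice, best_cost = choice, rollout_cost
        partial = partial + [best_choice]
    return partial
```

For example, sequencing jobs by processing time with total completion time as the cost, the base heuristic could be shortest-processing-time-first; the Pilot loop then performs one greedy completion per candidate at each step, so its cost is roughly the branching factor times the depth times one greedy run, in exchange for never committing to a choice whose greedy completion looks poor.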