An apprenticeship-learning-based technique is used as a hyper-heuristic to generate heuristics for an online combinatorial problem. It observes and learns from the actions of a known expert heuristic on small instances, with the advantage of producing a general heuristic that also works well on larger instances. Specifically, we generate heuristic policies for the online bin packing problem using expert near-optimal policies produced by a hyper-heuristic on small instances, where learning is fast. The "expert" is a policy matrix that defines an index policy; the apprenticeship learning is based on observing the actions of the expert policy together with a range of features of the bin under consideration, and then applying a k-means classification. We show that the generated policy often performs better than the standard best-fit heuristic, even when applied to instances much larger than those in the training set.
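The pipeline the abstract describes (observe an expert policy on small instances, record per-bin features alongside the expert's decisions, cluster the features with k-means, and reuse the cluster labels as a policy on larger instances) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the "expert" here is plain best fit rather than an evolved policy matrix, and the bin capacity, feature choice, and cluster count are all hypothetical.

```python
import random

CAP = 100  # assumed bin capacity; the paper's actual parameters differ


def best_fit(bins, item):
    """Stand-in 'expert': index of the feasible bin with least leftover space."""
    best, best_left = None, None
    for i, used in enumerate(bins):
        left = CAP - used - item
        if left >= 0 and (best_left is None or left < best_left):
            best, best_left = i, left
    return best  # None means open a new bin


def trace_expert(items):
    """Observe the expert, recording (features, action) for each feasible bin."""
    bins, data = [], []
    for item in items:
        choice = best_fit(bins, item)
        for i, used in enumerate(bins):
            if CAP - used >= item:
                feat = (item / CAP, (CAP - used) / CAP)  # assumed features
                data.append((feat, 1 if i == choice else 0))
        if choice is None:
            bins.append(item)
        else:
            bins[choice] += item
    return data


def nearest(f, centers):
    return min(range(len(centers)),
               key=lambda c: (f[0] - centers[c][0]) ** 2 + (f[1] - centers[c][1]) ** 2)


def kmeans(points, k, iters=30, seed=1):
    """Tiny k-means on 2-D feature tuples (no external libraries)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
                   if g else centers[j] for j, g in enumerate(groups)]
    return centers


def learn_policy(data, k=8):
    """Cluster features, then label each cluster by the expert's majority action."""
    centers = kmeans([f for f, _ in data], k)
    votes = [[0, 0] for _ in range(k)]
    for f, a in data:
        votes[nearest(f, centers)][a] += 1
    labels = [1 if v[1] >= v[0] else 0 for v in votes]
    return centers, labels


def apprentice_pack(items, centers, labels):
    """Pack with the learned policy: among feasible bins, prefer one whose
    features fall in an 'accept' cluster, tie-breaking by tightest fit."""
    bins = []
    for item in items:
        candidates = []
        for i, used in enumerate(bins):
            if CAP - used >= item:
                f = (item / CAP, (CAP - used) / CAP)
                if labels[nearest(f, centers)] == 1:
                    candidates.append((CAP - used - item, i))
        if candidates:
            bins[min(candidates)[1]] += item
        else:
            bins.append(item)
    return len(bins)


rng = random.Random(7)
train = [rng.randint(20, 60) for _ in range(200)]   # small training instance
test = [rng.randint(20, 60) for _ in range(2000)]   # much larger instance
centers, labels = learn_policy(trace_expert(train))
print("bins used by learned policy:", apprentice_pack(test, centers, labels))
```

Note the key property the abstract claims: training happens only on the small instance, and the learned cluster labels are then applied unchanged to an instance ten times larger.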