Policy learning in resource-constrained optimization

Authors:
Richard Allmendinger; Joshua Knowles
Affiliations:
University of Manchester, School of Computer Science, Manchester, United Kingdom; University of Manchester, School of Computer Science, Manchester, United Kingdom
Venue:
Proceedings of the 13th annual conference on Genetic and evolutionary computation
Year:
2011



Abstract

We consider an optimization scenario in which evaluating candidate solutions requires resources. The challenge we focus on is that certain resources, once used by an optimizer, must remain committed for some period of time. As a result, certain solutions may be temporarily non-evaluable during the optimization. Previous analysis revealed that evolutionary algorithms (EAs) can cope with this resourcing issue when augmented with static strategies for dealing with non-evaluable solutions, such as repairing, waiting, or penalty methods. Moreover, a suitable strategy for a resource-constrained problem can be selected offline if the resourcing issue is known in advance. In this paper we demonstrate that an EA that uses a reinforcement learning (RL) agent, here Sarsa(λ), to learn offline when to switch between static strategies can be more effective than any of the static strategies themselves. We also show that learning the same task online with an adaptive strategy selection method, here D-MAB, is not as effective; nevertheless, online learning is an alternative to static strategies.
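
To make the idea of learning when to switch between static strategies more concrete, the following is a minimal sketch of a tabular Sarsa(λ) agent whose actions are the three static strategies named in the abstract. It is an illustration only, not the authors' implementation: the class name `SarsaLambdaAgent`, the use of accumulating eligibility traces, the epsilon-greedy selection, and all parameter values are assumptions, and the abstract does not say how states or rewards are defined.

```python
import random
from collections import defaultdict

# Static strategies named in the abstract; used here as the agent's actions.
STRATEGIES = ("repair", "wait", "penalty")


class SarsaLambdaAgent:
    """Tabular Sarsa(lambda) with accumulating eligibility traces (hypothetical sketch)."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
        self.actions = tuple(actions)
        self.alpha = alpha        # learning rate (assumed value)
        self.gamma = gamma        # discount factor (assumed value)
        self.lam = lam            # trace decay, the lambda in Sarsa(lambda)
        self.epsilon = epsilon    # exploration probability (assumed value)
        self.q = defaultdict(float)      # Q[(state, action)]
        self.trace = defaultdict(float)  # eligibility e[(state, action)]
        self.rng = random.Random(seed)

    def select(self, state):
        """Epsilon-greedy choice of a strategy for the given state."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def start_episode(self):
        """Reset eligibility traces at the start of each optimization run."""
        self.trace.clear()

    def update(self, s, a, reward, s_next, a_next, done):
        """Apply one Sarsa(lambda) backup and return the TD error."""
        target = reward if done else reward + self.gamma * self.q[(s_next, a_next)]
        delta = target - self.q[(s, a)]
        self.trace[(s, a)] += 1.0  # accumulating trace for the visited pair
        for key in list(self.trace):
            self.q[key] += self.alpha * delta * self.trace[key]
            self.trace[key] *= self.gamma * self.lam  # decay every trace
        return delta
```

In an offline training setup, each episode would be one EA run on a problem with the known resourcing issue: the agent picks a strategy with `select`, the EA applies it, and `update` is called with whatever state encoding and reward signal the experimenter chooses (for example, a coarse time bucket and the fitness gain, both of which are assumptions here).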