In reinforcement learning, an agent interacting with its environment strives to learn a policy that specifies, for each state it may encounter, what action to take. Evolutionary computation is one of the most promising approaches to reinforcement learning, but its success is largely restricted to off-line scenarios. In on-line scenarios, an agent must strive to maximize the reward it accrues while it is learning. Temporal difference (TD) methods, another approach to reinforcement learning, naturally excel in on-line scenarios because they have selection mechanisms for balancing the need to search for better policies (exploration) with the need to accrue maximal reward (exploitation). This paper presents a novel way to strike this balance in evolutionary methods by borrowing the selection mechanisms used by TD methods to choose individual actions and using them in evolution to choose policies for evaluation. Empirical results in the mountain car and server job scheduling domains demonstrate that these techniques can substantially improve evolution's on-line performance in stochastic domains.
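To make the idea of choosing policies for evaluation concrete, the following is a minimal sketch, not the paper's implementation. It assumes epsilon-greedy as the borrowed selection mechanism (the abstract does not name a specific one) and applies it within a single generation: each policy is evaluated once, and every later episode goes either to a random policy (exploration) or to the policy with the best average fitness so far (exploitation). The names `epsilon_greedy_select`, `evaluate_generation`, and `run_episode`, the value of `epsilon`, and the toy noisy fitness function are all hypothetical.

```python
import random


def epsilon_greedy_select(avg_fitness, epsilon, rng):
    """Return the index of the policy to evaluate in the next episode."""
    if rng.random() < epsilon:
        # Explore: evaluate a policy chosen uniformly at random.
        return rng.randrange(len(avg_fitness))
    # Exploit: evaluate the policy with the best average fitness so far.
    return max(range(len(avg_fitness)), key=lambda i: avg_fitness[i])


def evaluate_generation(population, run_episode, episodes, epsilon=0.1, seed=0):
    """Spend `episodes` evaluations on one generation (assumed scheme).

    Every policy gets one initial episode so that each average is defined;
    the remaining episodes are allocated by epsilon-greedy selection.
    Returns (average fitness per policy, evaluation count per policy).
    """
    rng = random.Random(seed)
    n = len(population)
    totals = [0.0] * n
    counts = [0] * n
    avg = [0.0] * n
    for episode in range(episodes):
        if episode < n:
            i = episode  # initial pass: one evaluation per policy
        else:
            i = epsilon_greedy_select(avg, epsilon, rng)
        totals[i] += run_episode(population[i], rng)
        counts[i] += 1
        avg[i] = totals[i] / counts[i]
    return avg, counts


def noisy_fitness(policy, rng):
    """Toy stochastic domain (assumption): reward is the policy value plus noise."""
    return policy + rng.gauss(0.0, 0.1)


if __name__ == "__main__":
    avg, counts = evaluate_generation([0.0, 1.0, 5.0], noisy_fitness, episodes=300)
    print("average fitness:", avg)
    print("evaluations per policy:", counts)
```

Because most episodes go to the policy that currently looks best, the reward accrued during the generation stays high, while the occasional random evaluations keep refining the noisy fitness estimates of the other policies. The averages and counts returned here would then feed the usual evolutionary selection and reproduction step, which is omitted from this sketch.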