EDA-RL: estimation of distribution algorithms for reinforcement learning problems

Authors:
Hisashi Handa
Affiliations:
Okayama University, Okayama, Japan
Venue:
Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation
Year:
2009

References (8):
Technical Note: Q-Learning. Machine Learning.
Introduction to Reinforcement Learning.
Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation.
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning.
Using a Markov network model in a univariate EDA: an empirical cost-benefit analysis. GECCO '05 Proceedings of the 7th Annual Conference on Genetic and Evolutionary Computation.
Estimation of Distribution Algorithms with Kikuchi Approximations. Evolutionary Computation.
Studying XCS/BOA learning in Boolean functions: structure encoding and random Boolean functions. Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation.
FDA - a scalable evolutionary algorithm for the optimization of additively decomposed functions. Evolutionary Computation.

Cited by (4):
Use of infeasible individuals in probabilistic model building genetic network programming. Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation.
A Markovianity based optimisation algorithm. Genetic Programming and Evolvable Machines.
A novel classification learning framework based on estimation of distribution algorithms. International Journal of Computing Science and Mathematics.
A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research.

Abstract

By making use of probabilistic models, Estimation of Distribution Algorithms (EDAs) can outperform conventional evolutionary computations. In this paper, EDAs are extended to solve reinforcement learning problems, which arise naturally in a framework for autonomous agents. In reinforcement learning problems, the goal is to find better policies for agents so that the agents' future rewards increase. In general, such a policy can be represented by conditional probabilities of the agents' actions given their perceptual inputs. To estimate this conditional probability distribution, Conditional Random Fields (CRFs) by Lafferty et al. are newly introduced into EDAs in this paper. CRFs are adopted because they can learn conditional probability distributions from a large amount of input-output data, i.e., episodes in the case of reinforcement learning problems. By contrast, conventional reinforcement learning algorithms can only learn incrementally. Computer simulations of Probabilistic Transition Problems and Perceptual Aliasing Maze Problems show the effectiveness of EDA-RL.
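
The loop described in the abstract (sample episodes from the current policy, keep the better ones, re-estimate the conditional distribution of actions given inputs) can be illustrated with a minimal Python sketch. This is not the paper's implementation: it substitutes a smoothed count-based estimate of P(action | state) for the CRF, and the corridor environment, reward values, and all names (`run_episode`, `estimate_policy`, `eda_rl_sketch`, `keep`, `smoothing`) are hypothetical choices made only for illustration.

```python
import random
from collections import defaultdict

N_STATES = 5          # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
MAX_STEPS = 20        # episode length limit (assumed)


def run_episode(policy, rng):
    """Sample one episode; return ([(state, action), ...], total reward)."""
    state, trace, reward = 0, [], 0.0
    for _ in range(MAX_STEPS):
        action = rng.choices(ACTIONS, weights=policy[state])[0]
        trace.append((state, action))
        state = max(0, state - 1) if action == 0 else state + 1
        reward -= 1.0                      # assumed cost per step
        if state == N_STATES - 1:
            reward += 10.0                 # assumed reward at the goal
            break
    return trace, reward


def estimate_policy(episodes, smoothing=1.0):
    """Re-estimate P(action | state) from episodes by smoothed counts.

    Stand-in for the CRF estimation step; a CRF is NOT used here.
    """
    counts = defaultdict(lambda: [smoothing] * len(ACTIONS))
    for trace, _ in episodes:
        for state, action in trace:
            counts[state][action] += 1.0
    policy = {}
    for s in range(N_STATES):
        total = sum(counts[s])
        policy[s] = [c / total for c in counts[s]]
    return policy


def eda_rl_sketch(generations=30, n_episodes=100, keep=0.3, seed=0):
    """EDA-style loop: sample episodes, select the best, re-estimate."""
    rng = random.Random(seed)
    policy = {s: [0.5, 0.5] for s in range(N_STATES)}  # uniform start
    for _ in range(generations):
        episodes = [run_episode(policy, rng) for _ in range(n_episodes)]
        episodes.sort(key=lambda e: e[1], reverse=True)
        elite = episodes[: max(1, int(keep * n_episodes))]
        policy = estimate_policy(elite)
    return policy


if __name__ == "__main__":
    for state, probs in eda_rl_sketch().items():
        print(state, [round(p, 3) for p in probs])
```

The point of the sketch is the batch nature of the update: each generation's policy is estimated from a whole set of selected episodes at once, rather than adjusted one transition at a time as in incremental methods such as Q-learning.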