The goal of reinforcement learning is to find a policy that maximizes the expected reward accumulated by an agent over time through its interactions with the environment; to this end, a function of the agent's state has to be learned. States are often better characterized by a set of features. However, finding a "good" set of features is generally a tedious task that requires good domain knowledge. In this paper, we propose a genetic-programming-based approach to feature discovery in reinforcement learning. A population of individuals, each representing a set of features, is evolved, and individuals are evaluated by their average performance on short reinforcement learning trials. Experiments on several benchmark problems demonstrate that the resulting features allow the agent to learn better policies in fewer episodes.
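The evolve-and-evaluate loop described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the corridor environment, the primitive set (`OPS`, `TERMINALS`), linear Q-learning as the learner, mutation-only truncation selection in place of full genetic programming operators, and every numeric setting are assumptions chosen only to keep the example small and runnable.

```python
import operator
import random

# Hypothetical primitive set: binary operators and terminals over a scalar state x.
OPS = [operator.add, operator.sub, operator.mul]
TERMINALS = ["x", 1.0, 0.5]

def random_tree(rng, depth=2):
    """Build a random expression tree: a terminal or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(OPS), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate one feature tree at the normalized state value x."""
    if isinstance(tree, str):      # the variable terminal "x"
        return x
    if isinstance(tree, float):    # a constant terminal
        return tree
    op, left, right = tree
    return op(evaluate(left, x), evaluate(right, x))

def run_trial(individual, rng, episodes=10, n_states=6, max_steps=20):
    """One short linear Q-learning trial on an assumed corridor task.

    The agent starts at state 0 and is rewarded for reaching the last state.
    Returns the total reward collected over all episodes of the trial.
    """
    weights = [[0.0] * len(individual) for _ in range(2)]  # one vector per action
    alpha, gamma, eps = 0.02, 0.95, 0.1

    def phi(s):
        return [evaluate(tree, s / (n_states - 1)) for tree in individual]

    def q(features, a):
        return sum(w * v for w, v in zip(weights[a], features))

    total = 0.0
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            f = phi(s)
            if rng.random() < eps:
                a = rng.randrange(2)                     # explore
            else:
                a = max((0, 1), key=lambda b: q(f, b))   # exploit
            s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
            done = s2 == n_states - 1
            r = 1.0 if done else -0.01
            f2 = phi(s2)
            target = r if done else r + gamma * max(q(f2, b) for b in (0, 1))
            td_error = target - q(f, a)
            for i, v in enumerate(f):
                weights[a][i] += alpha * td_error * v
            total += r
            s = s2
            if done:
                break
    return total

def fitness(individual, rng, trials=3):
    """Average performance of a feature set over several short RL trials."""
    return sum(run_trial(individual, rng) for _ in range(trials)) / trials

def mutate(individual, rng):
    """Copy the feature set and replace one feature with a fresh random tree."""
    child = list(individual)
    child[rng.randrange(len(child))] = random_tree(rng)
    return child

def evolve(seed=0, pop_size=8, n_features=3, generations=5):
    """Evolve feature sets; returns the best individual seen and its fitness."""
    rng = random.Random(seed)
    pop = [[random_tree(rng) for _ in range(n_features)] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        scored = [(fitness(ind, rng), ind) for ind in pop]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
        parents = [ind for _, ind in scored[: pop_size // 2]]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - len(parents))]
        pop = parents + children
    return best, best_fit
```

Calling `evolve(seed=0)` returns a list of three feature trees together with its average trial return. Because fitness comes from short, stochastic trials, averaging over several trials (the hypothetical `trials` parameter) reduces the noise in the selection signal at the cost of more environment interaction.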