Reinforcement learning (RL) is a learning paradigm in which an agent learns to act optimally in an unknown environment through trial and error. An RL agent senses its current environmental state, selects an action, and applies it to the environment; the environment then returns a reinforcement signal, called the reward. The agent is expected to learn, through its internal mechanisms, a behavior that maximizes the cumulative reward it receives. One of the most challenging issues in RL arises from the limited sensory ability of the agent: when the agent cannot fully sense its current environmental state, the environment is said to be partially observable. In such environments, distinct environmental states may be indistinguishable to the agent, so it may fail to select the optimal action in particular states; its architecture must therefore be extended with an additional mechanism to perform optimally. Among the evolutionary approaches to reinforcement learning, learning classifier systems are one of the most widely used techniques; they evolve state-action-reward mappings that model the current environment through trial and error. In this paper we propose a new learning classifier system architecture that performs optimally in partially observable environments. The architecture uses a novel method to detect aliased states in the environment and disambiguates them through multiple instances of classifier systems that interact with the environment in parallel. We apply this model to several well-known benchmark problems and compare it with some of the best classifier systems proposed for such environments. Our results and detailed discussion show that our approach is among the best learning classifier system techniques for partially observable environments.
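The core difficulty described above, perceptual aliasing, can be illustrated with a minimal sketch (this toy corridor and its observation function are our own illustration, not the paper's system): two distinct underlying states emit the same observation, so a purely reactive, memoryless policy cannot act optimally in both.

```python
# Toy corridor with states 0..4. The agent never sees the state itself,
# only an observation; states 1 and 3 both look like "corridor".
OBSERVATION = {0: "left_end", 1: "corridor", 2: "junction", 3: "corridor", 4: "goal"}

def optimal_action(state):
    # Hypothetical task: state 1 demands "left", every other state "right".
    # Because states 1 and 3 share an observation but need different
    # actions, the observation "corridor" is ambiguous for the agent.
    return "left" if state == 1 else "right"

# Group the optimal actions by observation.
actions_by_obs = {}
for state, obs in OBSERVATION.items():
    actions_by_obs.setdefault(obs, set()).add(optimal_action(state))

# An observation is aliased (problematic) when distinct underlying states
# share it yet demand different optimal actions.
ambiguous = [obs for obs, acts in actions_by_obs.items() if len(acts) > 1]
print(ambiguous)  # → ['corridor']
```

Detecting such ambiguous observations is precisely what an aliasing-detection mechanism must do; once detected, the agent needs some internal structure (in this paper, parallel classifier system instances) to tell the aliased states apart.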