A New Architecture for Learning Classifier Systems to Solve POMDP Problems

Authors:
Ali Hamzeh; Adel Rahmani
Affiliations:
Computer Engineering Department, Iran University of Science and Technology, Teheran, Iran. E-mail: hamzeh@iust.ac.ir (corresponding author), rahmani@iust.ac.ir
Venue:
Fundamenta Informaticae
Year:
2008

Abstract

Reinforcement learning (RL) is a learning paradigm in which an agent learns to act optimally in an unknown environment through trial and error. An RL-based agent senses the state of its environment, proposes an action, and applies it to the environment. The environment then returns a reinforcement signal, called the reward, to the agent. The agent is expected to learn, through its internal mechanisms, how to maximize the overall reward it receives. One of the most challenging issues in RL arises from the agent's limited sensory ability, when it cannot sense its current environmental state completely. Such environments are called partially observable environments. In them, the agent may fail to distinguish the actual environmental state and so may fail to propose the optimal action in particular states. An additional mechanism must therefore be added to the agent's architecture to enable it to perform optimally in these environments. One of the most widely used approaches to reinforcement learning is evolutionary learning, and one of the most widely used techniques in this family is the learning classifier system. Learning classifier systems try to evolve state-action-reward mappings that model their current environment through trial and error. In this paper we propose a new architecture for learning classifier systems that is able to perform optimally in partially observable environments. The architecture uses a novel method to detect aliased states in the environment and disambiguates them through multiple instances of classifier systems that interact with the environment in parallel. The model is applied to several well-known benchmark problems and compared with some of the best classifier systems proposed for these environments. Our results and detailed discussion show that our approach is among the best-performing learning classifier systems in partially observable environments.
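
To make the notion of an aliased state concrete, the following minimal Python sketch builds a toy partially observable environment and lists the observations that hide conflicting optimal actions. It is an illustration only and is not the detection method proposed in the paper. The five-cell corridor, the wall-sensing observation, and the names `observe`, `optimal_action`, and `aliased_observations` are all assumptions introduced for this example.

```python
from collections import defaultdict

# Hypothetical 1-D corridor (an assumption for illustration): cells 0..4,
# with the goal in the middle cell.
N_CELLS = 5
GOAL = 2


def observe(cell):
    """Return the agent's local observation: (wall on left?, wall on right?)."""
    return (cell == 0, cell == N_CELLS - 1)


def optimal_action(cell):
    """Best move toward the goal: +1 (right), -1 (left), 0 (stay at goal)."""
    if cell < GOAL:
        return 1
    if cell > GOAL:
        return -1
    return 0


def aliased_observations():
    """Map each observation that is shared by states with different
    optimal actions to the list of states producing it."""
    by_obs = defaultdict(list)
    for cell in range(N_CELLS):
        by_obs[observe(cell)].append(cell)
    return {
        obs: cells
        for obs, cells in by_obs.items()
        if len({optimal_action(c) for c in cells}) > 1
    }


print(aliased_observations())  # prints {(False, False): [1, 2, 3]}
```

Cells 1, 2, and 3 all produce the observation `(False, False)`, yet they call for moving right, staying, and moving left respectively. A purely reactive agent that maps observations to actions cannot act optimally in all three, which is why some form of disambiguation is needed in such environments.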