Learning classifier system with average reward reinforcement learning

Authors:
Zhaoxiang Zang;Dehua Li;Junying Wang;Dan Xia
Affiliations:
Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan Hubei 430074, China;Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan Hubei 430074, China;College of Computer and Information Technology, China Three Gorges University, Yichang Hubei 443000, China;Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan Hubei 430074, China
Venue:
Knowledge-Based Systems
Year:
2013

Abstract

Among Learning Classifier Systems, XCS is the most widely used and investigated. However, standard XCS has difficulty solving large multi-step problems, where long action chains are needed to obtain delayed rewards. To date, reinforcement learning in XCS has been based on Q-learning, which optimizes the discounted total reward received by an agent but tends to limit the length of action chains. Undiscounted alternatives exist, such as R-learning and, more generally, average reward reinforcement learning, which optimize the average reward per time step. In this paper, R-learning replaces Q-learning as the reinforcement learning technique in XCS. The modification yields a classifier system that is rapid and able to solve large maze problems. In addition, it produces uniformly spaced payoff levels, which can support long action chains and thus effectively prevent overgeneralization.
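
To make the contrast between the two update rules concrete, the following is a minimal sketch of the textbook tabular forms of Q-learning and R-learning. It is not the XCS integration described in the paper: the function names, the table layout keyed by `(state, action)`, and the parameter values `beta`, `alpha`, and `gamma` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, beta=0.2, gamma=0.9):
    """Discounted update: future value is scaled by gamma at every step."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += beta * (r + gamma * best_next - Q[(s, a)])


def r_learning_update(R, rho, s, a, r, s_next, actions, beta=0.2, alpha=0.05):
    """Undiscounted update: reward is measured relative to the
    average reward estimate rho. Returns the updated rho."""
    best_next = max(R[(s_next, b)] for b in actions)
    R[(s, a)] += beta * (r - rho + best_next - R[(s, a)])
    best_here = max(R[(s, b)] for b in actions)
    # rho is adjusted only when the action taken is greedy in state s
    # (textbook formulation; other variants exist).
    if R[(s, a)] == best_here:
        rho += alpha * (r - rho + best_next - best_here)
    return rho


if __name__ == "__main__":
    actions = [0, 1]            # hypothetical two-action problem
    R = defaultdict(float)      # relative action values, initialised to 0
    rho = r_learning_update(R, 0.0, s=0, a=0, r=1.0, s_next=1, actions=actions)
    print(R[(0, 0)], rho)
```

Under these assumptions, in a deterministic maze that pays a reward only at the goal, the discounted values along an optimal path shrink geometrically with distance from the goal, whereas the relative values of successive states differ by roughly `rho`. This is one way to read the uniformly spaced payoff levels mentioned in the abstract.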