Feedback of Delayed Rewards in XCS for Environments with Aliasing States

  • Authors:
  • Kuang-Yuan Chen; Peter A. Lindsay

  • Affiliations:
  • ARC Centre for Complex Systems, The University of Queensland

  • Venue:
  • ACAL '09 Proceedings of the 4th Australian Conference on Artificial Life: Borrowing from Biology
  • Year:
  • 2009


Abstract

Wilson [13] showed how delayed reward feedback can be used to solve many multi-step problems for the widely used XCS learning classifier system. However, Wilson's method, based on back-propagation with discounting from Q-learning, runs into difficulties in environments with aliasing states, since the local reward function often fails to converge. This paper describes a different approach to reward feedback, in which a layered reward scheme for XCS classifiers is learnt during training. We show that, with a relatively minor modification to XCS feedback, the approach not only solves problems such as Woods1 but can also solve aliasing-state problems such as Littman57, MiyazakiA and MazeB.
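For context, the discounted back-propagation that the abstract attributes to Wilson's multi-step XCS is the standard Q-learning backup. The sketch below is not the paper's XCS implementation or its layered reward scheme; it is a generic tabular Q-learning illustration on a hypothetical toy corridor, using the reward of 1000 and discount factor 0.71 that are conventional in the XCS literature. All state and action names are invented for the example.

```python
# Generic discounted reward back-propagation (Q-learning backup),
# the mechanism XCS borrows for multi-step problems:
#   Q(s,a) <- Q(s,a) + beta * (r + gamma * max_a' Q(s',a') - Q(s,a))
# In a Markov environment these predictions converge; with aliasing
# states, distinct situations share one sensed state, so the target
# oscillates and the local prediction may never settle.
import random

def q_learning(transitions, goal, episodes=500, beta=0.2, gamma=0.71, seed=0):
    """Tabular Q-learning on a deterministic maze given as
    transitions[state][action] -> next_state.
    Reward is 1000 on reaching the goal, 0 otherwise."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(episodes):
        s = rng.choice([s for s in transitions if s != goal])
        while s != goal:
            a = rng.choice(list(transitions[s]))   # uniform exploration
            s2 = transitions[s][a]
            r = 1000.0 if s2 == goal else 0.0
            # discounted backup: goal states have no successor value
            target = r if s2 == goal else r + gamma * max(q[s2].values())
            q[s][a] += beta * (target - q[s][a])
            s = s2
    return q

# Hypothetical 3-state corridor: s0 -> s1 -> s2 (goal).
T = {"s0": {"right": "s1"}, "s1": {"right": "s2"}, "s2": {}}
q = q_learning(T, goal="s2")
# Predictions decay geometrically with distance from the goal:
# q["s1"]["right"] approaches 1000, q["s0"]["right"] approaches 0.71 * 1000.
```

Because every state here is distinct, the predictions converge; the paper's point is that when two such positions alias to the same sensed state, this local backup receives conflicting targets, which motivates the layered reward scheme instead.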