XCS with eligibility traces

  • Authors: Jan Drugowitsch; Alwyn M. Barry

  • Affiliations: University of Bath, Bath, UK; University of Bath, Bath, UK

  • Venue: GECCO '05 Proceedings of the 7th annual conference on Genetic and evolutionary computation
  • Year: 2005

Abstract

The development of the XCS Learning Classifier System has produced a robust and stable implementation that performs competitively in direct-reward environments. Although investigations in delayed-reward (i.e., multi-step) environments have shown promise, XCS still struggles to find optimal solutions efficiently in environments with long action-chains. This paper highlights the strong relation of XCS to reinforcement learning and identifies some of the major differences between them. Building on this relation, Eligibility Traces can be added to XCS. Eligibility traces are a method from reinforcement learning that updates the predictions along the whole action-chain on each step, which should make prediction updates faster and more accurate. However, it is shown that the discrete nature of a classifier's condition representation and the operation of the genetic algorithm cause traces to propagate incorrect prediction values backwards, which in some cases decreases system performance. As a result, further investigation of the existing approach to generalisation is proposed.
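
To illustrate the general idea of eligibility traces mentioned in the abstract, the following is a minimal sketch of tabular TD(λ) with accumulating traces on a short deterministic chain. It is not the XCS variant proposed in the paper: there are no classifiers, no generalisation and no genetic algorithm. The function name `run_episode`, the chain environment, and all parameter values are assumptions chosen purely for illustration.

```python
def run_episode(values, n_states, alpha=0.5, gamma=0.9, lam=0.0):
    """One pass along a hypothetical chain 0 -> 1 -> ... -> n_states - 1.

    The last state is terminal. The reward is 1.0 on the transition into
    the terminal state and 0.0 otherwise (illustrative assumption).
    """
    traces = [0.0] * n_states
    for s in range(n_states - 1):
        s_next = s + 1
        terminal = s_next == n_states - 1
        reward = 1.0 if terminal else 0.0
        # One-step TD target; the terminal state has value 0 by definition.
        target = reward if terminal else reward + gamma * values[s_next]
        delta = target - values[s]
        # Accumulating trace: mark the state that was just visited.
        traces[s] += 1.0
        # Every state is updated in proportion to its trace, then all
        # traces decay by gamma * lambda.
        for i in range(n_states):
            values[i] += alpha * delta * traces[i]
            traces[i] *= gamma * lam
    return values


N_STATES = 5
# lam = 0.0 reduces to a one-step update: only the last step learns.
one_step = run_episode([0.0] * N_STATES, N_STATES, lam=0.0)
# lam = 0.9 spreads the final reward back along the whole chain.
with_traces = run_episode([0.0] * N_STATES, N_STATES, lam=0.9)
```

After a single episode, `one_step` has changed only the value of the state directly before the goal, whereas `with_traces` has updated every visited state, with smaller increments for states further from the reward. This is the faster backward propagation the abstract refers to; the paper's finding is that in XCS the same mechanism can also propagate incorrect prediction values.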