Feedback of Delayed Rewards in XCS for Environments with Aliasing States

  • Authors:
  • Kuang-Yuan Chen; Peter A. Lindsay

  • Affiliations:
  • ARC Centre for Complex Systems, The University of Queensland

  • Venue:
  • ACAL '09 Proceedings of the 4th Australian Conference on Artificial Life: Borrowing from Biology
  • Year:
  • 2009


Abstract

Wilson [13] showed how delayed reward feedback can be used to solve many multi-step problems for the widely used XCS learning classifier system. However, Wilson's method, based on back-propagation with discounting from Q-learning, runs into difficulties in environments with aliasing states, since the local reward function often fails to converge. This paper describes a different approach to reward feedback, in which a layered reward scheme for XCS classifiers is learnt during training. We show that, with a relatively minor modification to XCS feedback, the approach not only solves problems such as Woods1 but can also solve aliasing-state problems such as Littman57, MiyazakiA and MazeB.
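For context, the discounted back-propagation that the abstract attributes to Wilson's multi-step XCS is the standard Q-learning backup. The sketch below is not the paper's XCS implementation or its layered reward scheme; it is a generic tabular Q-learning illustration on a hypothetical toy corridor, using the reward of 1000 and discount factor 0.71 that are conventional in the XCS literature. All state and action names are invented for the example.

```python
# Generic discounted reward back-propagation (Q-learning backup),
# the mechanism XCS borrows for multi-step problems:
#   Q(s,a) <- Q(s,a) + beta * (r + gamma * max_a' Q(s',a') - Q(s,a))
# In a Markov environment these predictions converge; with aliasing
# states, distinct situations share one sensed state, so the target
# oscillates and the local prediction may never settle.
import random

def q_learning(transitions, goal, episodes=500, beta=0.2, gamma=0.71, seed=0):
    """Tabular Q-learning on a deterministic maze given as
    transitions[state][action] -> next_state.
    Reward is 1000 on reaching the goal, 0 otherwise."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(episodes):
        s = rng.choice([s for s in transitions if s != goal])
        while s != goal:
            a = rng.choice(list(transitions[s]))   # uniform exploration
            s2 = transitions[s][a]
            r = 1000.0 if s2 == goal else 0.0
            # discounted backup: goal states have no successor value
            target = r if s2 == goal else r + gamma * max(q[s2].values())
            q[s][a] += beta * (target - q[s][a])
            s = s2
    return q

# Hypothetical 3-state corridor: s0 -> s1 -> s2 (goal).
T = {"s0": {"right": "s1"}, "s1": {"right": "s2"}, "s2": {}}
q = q_learning(T, goal="s2")
# Predictions decay geometrically with distance from the goal:
# q["s1"]["right"] approaches 1000, q["s0"]["right"] approaches 0.71 * 1000.
```

Because every state here is distinct, the predictions converge; the paper's point is that when two such positions alias to the same sensed state, this local backup receives conflicting targets, which motivates the layered reward scheme instead.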