Autonomous shaping via coevolutionary selection of training experience

  • Authors:
  • Marcin Szubert; Krzysztof Krawiec

  • Affiliations:
  • Institute of Computing Science, Poznan University of Technology, Poznań, Poland; Institute of Computing Science, Poznan University of Technology, Poznań, Poland

  • Venue:
  • PPSN'12 Proceedings of the 12th international conference on Parallel Problem Solving from Nature - Volume Part II
  • Year:
  • 2012

Abstract

To acquire expert skills in a sequential decision-making domain that is too vast to be explored thoroughly, an intelligent agent must be able to induce crucial knowledge from the most representative parts of that domain. One way to shape the learning process and guide the learner in the right direction is to select the parts that provide the best training experience. To realize this concept, we propose a shaping method that orchestrates training by iteratively exposing the learner to subproblems generated autonomously from the original problem. The main novelty of the proposed approach lies in equating the learning process with a search in the space of subproblems and in employing a coevolutionary algorithm to perform this search. Each individual in the population encodes a sequence of subproblems and is evaluated by training a learner on that sequence and confronting it with learners shaped in the same way by other individuals in the population. When applied to the game of Othello, temporal difference learning on the best subproblem sequence found yields substantially better players than learning on the entire problem at once.
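
The evaluation loop described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation: the abstract does not specify the subproblem representation, the genetic operators, or the selection scheme, so everything named below is an assumption made for illustration. Subproblems are plain integer indices, `train` is a toy stand-in for temporal difference learning, `play` is a toy stand-in for a game of Othello between two trained players, and truncation selection with point mutation is one arbitrary choice among many.

```python
import random

NUM_SUBPROBLEMS = 8   # hypothetical size of the subproblem space
SEQ_LEN = 5           # hypothetical length of one training sequence
# Toy stand-in for "what there is to learn" in each subproblem.
TARGET = [0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6]


def train(sequence, rate=0.5):
    """Toy learner: start from zero weights and, for each subproblem in
    order, move the matching weight halfway toward its target value.
    This only stands in for temporal difference learning."""
    weights = [0.0] * NUM_SUBPROBLEMS
    for s in sequence:
        weights[s] += rate * (TARGET[s] - weights[s])
    return weights


def play(a, b):
    """Toy confrontation: the learner with the smaller total error wins.
    Returns 1.0 if `a` wins, 0.5 for a draw, 0.0 if `a` loses."""
    err_a = sum(abs(t - w) for t, w in zip(TARGET, a))
    err_b = sum(abs(t - w) for t, w in zip(TARGET, b))
    if err_a < err_b:
        return 1.0
    return 0.5 if err_a == err_b else 0.0


def evaluate(population):
    """Fitness of an individual is the mean result of the learner it
    trained against the learners trained by all other individuals."""
    learners = [train(ind) for ind in population]
    fitness = []
    for i, a in enumerate(learners):
        results = [play(a, b) for j, b in enumerate(learners) if j != i]
        fitness.append(sum(results) / len(results))
    return fitness


def mutate(sequence, rng, p=0.2):
    """Replace each subproblem with a random one with probability p."""
    return [rng.randrange(NUM_SUBPROBLEMS) if rng.random() < p else s
            for s in sequence]


def coevolve(pop_size=10, generations=20, seed=0):
    """Search the space of subproblem sequences (requires pop_size >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randrange(NUM_SUBPROBLEMS) for _ in range(SEQ_LEN)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        fitness = evaluate(population)
        ranked = [ind for _, ind in sorted(zip(fitness, population),
                                           key=lambda pair: pair[0],
                                           reverse=True)]
        parents = ranked[:pop_size // 2]   # truncation selection (assumed)
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    fitness = evaluate(population)
    best = max(range(pop_size), key=lambda i: fitness[i])
    return population[best], fitness[best]


if __name__ == "__main__":
    best_sequence, best_fitness = coevolve()
    print(best_sequence, best_fitness)
```

The point of the sketch is the shape of the fitness function: a sequence is never scored directly, only through the relative performance of the learner it produces, so fitness depends on the current population. In the setting the abstract describes, `train` and `play` would be replaced by temporal difference learning and actual Othello games.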