Epoch-Incremental Queue-Dyna Algorithm

  • Authors:
  • Roman Zajdel

  • Affiliations:
  • Faculty of Electrical and Computer Engineering, Rzeszow University of Technology, 35-959 Rzeszow, Poland

  • Venue:
  • ICAISC '08: Proceedings of the 9th International Conference on Artificial Intelligence and Soft Computing
  • Year:
  • 2008

Abstract

A basic reinforcement learning algorithm such as Q-learning is characterized by a single learning step of short duration; however, the number of epochs necessary to achieve the optimal policy is not satisfactory. There are many methods that reduce the number of necessary epochs, such as TD(λ > 0), Dyna or prioritized sweeping, but their learning time is considerable. This paper proposes a combination of the Q-learning algorithm, performed in incremental mode, with an acceleration method executed in epoch mode that is based on an environment model and the distance to the terminal state. This approach maintains the short duration of a single learning step while achieving efficiency comparable with Dyna or prioritized sweeping. The proposed algorithm is compared with Q(λ)-learning, Dyna-Q and prioritized sweeping in experiments on three maze tasks. The learning time and the number of epochs necessary to reach the terminal state are used to evaluate the efficiency of the compared algorithms.
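
To make the idea of combining incremental Q-learning with an epoch-mode, model-based sweep more concrete, the following is a minimal Python sketch of that general scheme. It is not the paper's algorithm: the class name, parameter values, and the use of a recorded distance-to-goal table to order the end-of-epoch updates are illustrative assumptions, chosen only to show how one-step updates during an epoch can coexist with a model-based replay once the terminal state is reached.

```python
import random
from collections import defaultdict


class EpochIncrementalAgent:
    """Illustrative sketch (not the paper's exact method): incremental
    Q-learning during an epoch, plus a model-based sweep at epoch end."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.Q = defaultdict(float)   # (state, action) -> estimated value
        self.model = {}               # (state, action) -> (reward, next_state)

    def choose_action(self, state):
        # epsilon-greedy action selection
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.Q[(state, a)])

    def incremental_update(self, s, a, r, s_next):
        # standard one-step Q-learning update (incremental mode),
        # plus recording the transition in a deterministic model
        best_next = max(self.Q[(s_next, b)] for b in self.actions)
        self.Q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.Q[(s, a)])
        self.model[(s, a)] = (r, s_next)

    def epoch_update(self, distance_to_goal):
        # epoch mode (assumed ordering): replay modelled transitions sorted
        # by their recorded distance to the terminal state, so that values
        # propagate backwards from the goal in few passes
        ordered = sorted(self.model,
                         key=lambda sa: distance_to_goal.get(sa[0], float("inf")))
        for (s, a) in ordered:
            r, s_next = self.model[(s, a)]
            best_next = max(self.Q[(s_next, b)] for b in self.actions)
            self.Q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.Q[(s, a)])
```

Here `distance_to_goal` is assumed to be a dictionary mapping each visited state to the number of steps that separated it from the terminal state in the finished epoch; the agent could record it while the epoch runs. The point of the sketch is the division of labour: cheap one-step updates keep each learning step short, while the ordered, model-based replay at epoch boundaries provides the acceleration normally associated with Dyna-style planning.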