Swarm Reinforcement Learning Algorithm Based on Particle Swarm Optimization Whose Personal Bests Have Lifespans

  • Authors: Hitoshi Iima; Yasuaki Kuroe

  • Affiliations: Kyoto Institute of Technology, Kyoto, Japan (both authors)

  • Venue: ICONIP '09 Proceedings of the 16th International Conference on Neural Information Processing: Part II
  • Year: 2009

Abstract

We recently proposed a swarm reinforcement learning algorithm based on particle swarm optimization (PSO) to find optimal policies rapidly. In this algorithm, multiple agents are prepared, and they learn not only through individual learning but also through a PSO update procedure. In this procedure, state-action values are updated using the personal best and the global best found by the agents so far. In this paper, we address the problem that overvalued personal bests degrade learning performance. To avoid updating the state-action values based on an overvalued personal best, we propose a PSO-based swarm reinforcement learning algorithm in which the personal best of each agent has a lifespan.
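
The abstract does not give the update equations or the rule for managing lifespans, so the following Python is only a minimal sketch of the idea under stated assumptions. It assumes the standard PSO velocity update applied to a flattened table of state-action values, and it assumes that an expired personal best is simply replaced by the agent's current values and score. All names (`Agent`, `refresh_personal_best`, `pso_update`, `lifespan`, `score`) and the coefficient defaults are hypothetical, and `score` stands for any evaluation of the agent's current policy, such as the return of a recent episode.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    q: list                 # flattened state-action values of this agent
    velocity: list          # PSO velocity, same length as q
    pbest_q: list = field(default_factory=list)  # values stored as personal best
    pbest_score: float = float("-inf")           # evaluation of the personal best
    pbest_age: int = 0      # refreshes since the personal best was last replaced


def refresh_personal_best(agent, score, lifespan):
    """Replace the personal best if it is beaten or has outlived `lifespan`.

    Assumption: on expiry the personal best is reset to the current values,
    even if the current score is lower than the stored one.
    """
    expired = agent.pbest_age >= lifespan
    if expired or score > agent.pbest_score:
        agent.pbest_q = list(agent.q)
        agent.pbest_score = score
        agent.pbest_age = 0
    else:
        agent.pbest_age += 1


def pso_update(agent, gbest_q, w=0.7, c1=1.4, c2=1.4, rng=random):
    """Standard PSO step (assumed): pull q toward personal and global bests."""
    for i in range(len(agent.q)):
        r1, r2 = rng.random(), rng.random()
        agent.velocity[i] = (
            w * agent.velocity[i]
            + c1 * r1 * (agent.pbest_q[i] - agent.q[i])
            + c2 * r2 * (gbest_q[i] - agent.q[i])
        )
        agent.q[i] += agent.velocity[i]
```

In this sketch, a personal best that was recorded with an unusually high score (for example because of a lucky episode) can influence `pso_update` for at most `lifespan` unimproved refreshes before it is discarded, which is one plausible reading of how a lifespan limits the effect of an overvalued personal best.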