Multiagent reinforcement learning algorithm using temporal difference error

  • Authors: SeungGwan Lee
  • Affiliations: School of Computer Science and Information Engineering, Catholic University, Bucheon-Si, Gyeonggi-Do, Korea
  • Venue: ISNN'05 Proceedings of the Second international conference on Advances in Neural Networks - Volume Part I
  • Year: 2005

Abstract

In reinforcement learning, when an agent chooses an action in its current state and makes a state transition, an important question is how to reward the action the agent chose. In this paper, we introduce the Ant-Q learning method, a meta-heuristic for hard combinatorial optimization problems that was proposed to solve the Traveling Salesman Problem (TSP). It is a population-based approach that uses positive feedback as well as greedy search. Building on it, we propose an ant reinforcement learning model using TD-error (ARLM-TDE). Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than the original ACS and Ant-Q.
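
The abstract does not state the ARLM-TDE update rule, so the following is only a minimal sketch of a generic one-step temporal difference (TD) error applied to a table of action values, in the spirit of Ant-Q on a TSP. The function names, the nested-dictionary table `q`, the `allowed_next` list of cities still open to the ant, and the default values of `alpha` and `gamma` are all illustrative assumptions, not the paper's definitions.

```python
def td_error(reward, gamma, next_values, current_value):
    """Generic one-step TD error: reward + gamma * max(next) - current.

    Assumption: the best value among the still-allowed next moves is used
    as the bootstrap term; if no move is left, the bootstrap term is 0.
    """
    best_next = max(next_values) if next_values else 0.0
    return reward + gamma * best_next - current_value


def td_update(q, state, action, reward, next_state, allowed_next,
              alpha=0.1, gamma=0.3):
    """Move q[state][action] toward the TD target by a step of size alpha.

    q is a hypothetical nested dict: q[city][next_city] -> value.
    Returns the TD error so the caller can inspect or log it.
    """
    next_values = [q[next_state][a] for a in allowed_next]
    delta = td_error(reward, gamma, next_values, q[state][action])
    q[state][action] += alpha * delta
    return delta
```

For instance, on a three-city table where an ant moves from city 0 to city 1 and only city 2 remains unvisited, one could call `td_update(q, 0, 1, reward, 1, [2])`. A positive TD error raises the value of the edge just taken and a negative one lowers it, which is one plausible way to decide how strongly a chosen action is rewarded.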