Multiagent reinforcement learning algorithm using temporal difference error

Authors:
SeungGwan Lee
Affiliations:
School of Computer Science and Information Engineering, Catholic University, Bucheon-Si, Gyeonggi-Do, Korea
Venue:
ISNN'05 Proceedings of the Second international conference on Advances in Neural Networks - Volume Part I
Year:
2005

References (4):
Efficient reinforcement learning. COLT '94 Proceedings of the seventh annual conference on Computational learning theory
A Study of Some Properties of Ant-Q. PPSN IV Proceedings of the 4th International Conference on Parallel Problem Solving from Nature
Ant colony system: a cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation
Ant system: optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Cited by (3):
On the efficient implementation biologic reinforcement learning using eligibility traces. ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I
A cooperation online reinforcement learning approach in ant-q. ICONIP'06 Proceedings of the 13th international conference on Neural Information Processing - Volume Part I
Efficient ant reinforcement learning using replacing eligibility traces. ICAISC'06 Proceedings of the 8th international conference on Artificial Intelligence and Soft Computing

Abstract

In reinforcement learning, when an agent chooses an action and makes a state transition from its current state, an important question is how to determine the reward for the action the agent chose. In this paper, we introduce the Ant-Q learning method, a new metaheuristic for hard combinatorial optimization problems that was proposed to solve the Traveling Salesman Problem (TSP). It is a population-based approach that uses positive feedback as well as greedy search. Building on it, we propose an ant reinforcement learning model using TD-error (ARLM-TDE). Experiments show that the proposed reinforcement learning method converges to the optimal solution faster than the original ACS and Ant-Q.
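
The abstract does not give the ARLM-TDE update rule, so the following is only a minimal sketch of the generic one-step temporal difference (TD) error as used in standard Q-learning, not the paper's method. The function names, the nested-dictionary Q-table, the reward values, and the parameters `alpha` and `gamma` are all assumptions made for illustration. Cities play the role of states, and moving to a city is the action.

```python
def td_error(q, state, action, reward, next_state, next_actions, gamma=0.9):
    """Generic one-step TD error: r + gamma * max_a' Q(s', a') - Q(s, a).

    This is the textbook Q-learning form, assumed here for illustration;
    it is not taken from the paper.
    """
    # Best estimated value reachable from the next state (0.0 if no moves remain).
    best_next = max((q[next_state][a] for a in next_actions), default=0.0)
    return reward + gamma * best_next - q[state][action]


def td_update(q, state, action, reward, next_state, next_actions,
              alpha=0.1, gamma=0.9):
    """Move Q(s, a) toward its TD target by a step of size alpha."""
    delta = td_error(q, state, action, reward, next_state, next_actions, gamma)
    q[state][action] += alpha * delta
    return delta


# Hypothetical 3-city example: q[i][j] is the value of moving from city i to city j.
q = {i: {j: 0.0 for j in range(3) if j != i} for i in range(3)}

# Move 0 -> 1 with an assumed reward of 1.0; city 2 is still unvisited.
delta_first = td_update(q, 0, 1, reward=1.0, next_state=1, next_actions=[2])

# Move 2 -> 0 with no immediate reward; the value learned for (0, 1) propagates back.
delta_second = td_update(q, 2, 0, reward=0.0, next_state=0, next_actions=[1])
```

In this sketch the second update has a nonzero TD error even though its immediate reward is zero, because the value already learned for the edge (0, 1) is bootstrapped into the target. How the paper combines such an error with Ant-Q's pheromone-like values is not stated in the abstract.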