COOPERATIVE LEARNING BY POLICY-SHARING IN MULTIPLE AGENTS

Authors:
Kao-Shing Hwang;Chia-Ju Lin;Chia-Yue Lo
Affiliations:
Department of Electrical Engineering, National Chung Cheng University, Chia-Yi, Taiwan;Department of Electrical Engineering, National Chung Cheng University, Chia-Yi, Taiwan;Department of Electrical Engineering, National Chung Cheng University, Chia-Yi, Taiwan
Venue:
Cybernetics and Systems
Year:
2009

Abstract

Reinforcement learning is one of the more prominent machine-learning techniques because it needs no supervised training signal and can keep learning even in a dynamic operating environment. Applied to cooperative multi-agent systems, it lets each agent learn from its own experience and also from the other agents in the system, which can accelerate learning. In the proposed learning algorithm, an agent adapts to its peers by learning cautiously when it receives a positive reinforcement signal and more aggressively when a negative reward follows the action just taken. These two properties form the basis of the proposed cooperative learning method. This research presents a novel combination of the Win or Learn Fast (WoLF) policy hill-climbing method with policy-sharing. Results from a multi-agent cooperative domain show that the proposed algorithms perform better than Q-learning alone in a piano-mover environment. They also demonstrate that agents can learn to accomplish a task together efficiently through repeated trials.
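The variable learning rate described in the abstract can be illustrated with a minimal sketch of one WoLF policy hill-climbing step for a single agent. It follows the commonly published form of WoLF-PHC and is not taken from the paper itself. The function name, the table layout (lists indexed by state and then action) and all parameter values are assumptions made for illustration, and the policy-sharing step is not shown because the abstract does not specify it.

```python
def wolf_phc_step(Q, pi, pi_avg, counts, s, a, r, s_next,
                  alpha=0.1, gamma=0.9, delta_win=0.01, delta_lose=0.04):
    """One WoLF-PHC update for a single agent; tables are modified in place.

    All names and default values here are illustrative assumptions.
    """
    n_actions = len(Q[s])

    # Standard Q-learning update for the action just taken.
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * (r + gamma * max(Q[s_next]))

    # Running average of the policy used in state s.
    counts[s] += 1
    for b in range(n_actions):
        pi_avg[s][b] += (pi[s][b] - pi_avg[s][b]) / counts[s]

    # "Winning" means the current policy has a higher expected value
    # than the average policy: learn cautiously. Otherwise learn fast.
    v_cur = sum(p * q for p, q in zip(pi[s], Q[s]))
    v_avg = sum(p * q for p, q in zip(pi_avg[s], Q[s]))
    delta = delta_win if v_cur > v_avg else delta_lose

    # Hill-climb: shift probability mass toward the greedy action,
    # never taking more from an action than it currently holds.
    best = max(range(n_actions), key=lambda b: Q[s][b])
    for b in range(n_actions):
        if b != best:
            step = min(pi[s][b], delta / (n_actions - 1))
            pi[s][b] -= step
            pi[s][best] += step
    return delta


# Usage on a hypothetical two-state, two-action problem.
Q = [[0.0, 0.0], [0.0, 0.0]]
pi = [[0.5, 0.5], [0.5, 0.5]]
pi_avg = [[0.5, 0.5], [0.5, 0.5]]
counts = [0, 0]
used_delta = wolf_phc_step(Q, pi, pi_avg, counts, s=0, a=1, r=1.0, s_next=1)
```

Setting delta_lose larger than delta_win is what produces the cautious-when-winning, aggressive-when-losing behavior. The clipping in the final loop keeps each row of pi a valid probability distribution.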