Multiagent learning using a variable learning rate
Artificial Intelligence
Introduction to Reinforcement Learning
Introduction to Reinforcement Learning
Adaptive policy gradient in multiagent learning
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
An Improved Multiagent Reinforcement Learning Algorithm
IAT '05 Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology
Rational and convergent learning in stochastic games
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Q-Learning with FCMAC in multi-agent cooperation
ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I
Expertness based cooperative Q-learning
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Policy sharing between multiple mobile robots using decision trees
Information Sciences: an International Journal
Hi-index | 0.00 |
Reinforcement learning is one of the more prominent machine-learning technologies due to its unsupervised learning structure and ability to continually learn, even in a dynamic operating environment. Applying this learning to cooperative multi-agent systems not only allows each individual agent to learn from its own experience, but also offers the opportunity for the individual agents to learn from the other agents in the system so the speed of learning can be accelerated. In the proposed learning algorithm, an agent adapts to comply with its peers by learning carefully when it obtains a positive reinforcement feedback signal, but should learn more aggressively if a negative reward follows the action just taken. These two properties are applied to develop the proposed cooperative learning method. This research presents the novel use of the fastest policy hill-climbing methods of Win or Lose Fast (WoLF) with policy-sharing. Results from the multi-agent cooperative domain illustrate that the proposed algorithms perform better than Q-learning alone in a piano mover environment. It also demonstrates that agents can learn to accomplish a task together efficiently through repetitive trials.