In human society, learning is essential to intelligent behavior. However, people do not need to learn everything from scratch through their own discovery. Instead, they exchange information and knowledge with one another and learn from their peers and teachers. When a task is too complex for an individual to handle, that individual may cooperate with partners to accomplish it. As in human society, cooperation exists in other species, such as ants, which are known to communicate the locations of food and to move it cooperatively. Using the experience and knowledge of other agents, a learning agent may learn faster, make fewer mistakes, and create rules for unstructured situations. In the proposed learning algorithm, an agent adapts to comply with its peers by learning cautiously when it receives a positive reinforcement signal and more aggressively when a negative reward follows the action just taken. These two properties form the conceptual basis of the proposed cooperative learning method. The algorithm is implemented in several cooperative tasks, and the results demonstrate that agents can learn to accomplish a task together efficiently through repeated trials.
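The two learning properties described above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes a standard tabular Q-learning update in which the step size is chosen by the sign of the reward. The names `update`, `alpha_cautious`, and `alpha_aggressive`, the numeric defaults, and the choice to treat a zero reward as non-negative are all assumptions made for illustration.

```python
from collections import defaultdict


def make_q_table():
    """Return a Q-table mapping (state, action) pairs to values, defaulting to 0."""
    return defaultdict(float)


def update(q, state, action, reward, next_state, actions,
           alpha_cautious=0.05, alpha_aggressive=0.5, gamma=0.9):
    """Apply one Q-learning step with a reward-dependent learning rate.

    Assumption: a small step size is used after a non-negative reward
    (cautious learning) and a larger one after a negative reward
    (aggressive learning). Returns the step size that was applied.
    """
    alpha = alpha_cautious if reward >= 0 else alpha_aggressive
    # Standard one-step temporal-difference target for Q-learning.
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return alpha


# Hypothetical usage: one rewarded action and one penalized action.
actions = ["left", "right"]
q = make_q_table()
used_pos = update(q, "s0", "left", 1.0, "s1", actions)
used_neg = update(q, "s0", "right", -1.0, "s1", actions)
```

With these illustrative settings, the penalized action's value moves ten times further per step than the rewarded action's value, which is one simple way to express "learn cautiously on success, aggressively on failure".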