Proceedings of the 2nd International Conference on Interaction Sciences: Information Technology, Culture and Human
This paper presents a novel method for on-line coordination in multiagent reinforcement learning systems. In this method, a reinforcement-learning agent learns to select its actions by estimating system dynamics in terms of both the natural reward for task achievement and a virtual reward for cooperation. The virtual reward for cooperation is determined dynamically by a coordinating agent, which estimates it from changes in the degree of cooperation among all agents using a separate reinforcement-learning process. This technique provides adaptive coordination, requires less communication, and ensures that the agents remain cooperative. The validity of virtual rewards for convergence in learning is verified, and the proposed method is tested on two different simulated domains to illustrate its significance. The empirical performance of the coordinated system, compared with that of the uncoordinated system, demonstrates its advantages for multiagent systems.
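The combined-reward idea described in the abstract can be sketched as a standard tabular Q-learning update whose learning signal is the task ("natural") reward plus a coordinator-supplied "virtual" cooperation reward. The toy environment, the fixed virtual-reward bonus, and all function names below are illustrative assumptions, not the paper's actual implementation (in the paper, the virtual reward is itself learned by the coordinating agent).

```python
import random

random.seed(0)

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = [0, 1]

def q_update(q, state, action, natural_r, virtual_r, next_state):
    """One Q-learning step on the combined reward r = natural + virtual."""
    combined = natural_r + virtual_r
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    key = (state, action)
    q[key] = q.get(key, 0.0) + ALPHA * (combined + GAMMA * best_next - q.get(key, 0.0))

def choose_action(q, state):
    """Epsilon-greedy action selection over the current Q-estimates."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))

# Toy usage: one agent in a two-state chain. The coordinator's virtual reward
# is stood in by a fixed bonus for the "cooperative" action (action 1).
q = {}
state = 0
for _ in range(500):
    a = choose_action(q, state)
    natural_r = 1.0 if (state == 1 and a == 1) else 0.0
    virtual_r = 0.2 if a == 1 else 0.0  # stand-in for the learned virtual reward
    next_state = 1 if a == 1 else 0
    q_update(q, state, a, natural_r, virtual_r, next_state)
    state = next_state

# The virtual bonus biases learning toward the cooperative action.
print(q.get((1, 1), 0.0) > q.get((1, 0), 0.0))
```

Here the virtual reward simply augments the Bellman target; the paper's contribution is that this quantity is estimated on-line by a coordinating agent rather than fixed in advance.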