The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Multiagent learning using a variable learning rate
Artificial Intelligence
Introduction to Reinforcement Learning
Introduction to Reinforcement Learning
Nash Convergence of Gradient Dynamics in General-Sum Games
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Reinforcement Learning in POMDP's via Direct Gradient Ascent
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Agendas for multi-agent learning
Artificial Intelligence
The Dynamics of Multi-Agent Reinforcement Learning
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Artificial intelligence in robocup
Reasoning, Action and Interaction in AI Theories and Systems
Multi-agent case-based reasoning for cooperative reinforcement learners
ECCBR'06 Proceedings of the 8th European conference on Advances in Case-Based Reasoning
Adaptive multi-robot team reconfiguration using a policy-reuse reinforcement learning approach
AAMAS'11 Proceedings of the 10th international conference on Advanced Agent Technology
Hi-index | 0.00 |
Multi-robot learning faces all of the challenges of robot learning with all of the challenges of multiagent learning. There has been a great deal of recent research on multiagent reinforcement learning in stochastic games, which is the intuitive extension of MDPs to multiple agents. This recent work, although general, has only been applied to small games with at most hundreds of states. On the other hand robot tasks have continuous, and often complex, state and action spaces. Robot learning tasks demand approximation and generalization techniques, which have only received extensive attention in single-agent learning. In this paper we introduce GraWoLF, a general-purpose, scalable, multiagent learning algorithm. It combines gradient-based policy learning techniques with the WoLF ("Win or Learn Fast") variable learning rate. We apply this algorithm to an adversarial multi-robot task with simultaneous learning. We show results of learning both in simulation and on the real robots. These results demonstrate that GraWoLF can learn successful policies, overcoming the many challenges in multi-robot learning.