Orchestrating multiagent learning of penalty games

Authors:
Ana L. C. Bazzan
Affiliations:
PPGC / Instituto de Informática, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, RS, Brazil
Venue:
SBIA'12 Proceedings of the 21st Brazilian conference on Advances in Artificial Intelligence
Year:
2012

Abstract

Compared with single-agent learning, reinforcement learning in a multiagent scenario is more challenging because the space of joint action combinations that may have to be explored before agents learn an efficient policy is larger. One approach that has been proposed to address this problem is to bias the exploration. We follow this track using an organizational structure in which low-level agents rely mainly on reinforcement learning while also receiving recommendations from agents that have a broader view. These higher-level agents keep a base of cases from which they draw such recommendations, thereby orchestrating the learning process. We show that this approach is able to accelerate and improve learning in penalty games, a special case of coordination games.
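
To make the idea of biased exploration concrete, the following is a minimal sketch, not the paper's implementation. It assumes the two-agent, three-action penalty game commonly used in the cooperative multiagent learning literature (payoff 10 for the two optimal joint actions, 2 for a safe suboptimal one, and a penalty k <= 0 for miscoordination). The supervisor's case base is reduced here to a single remembered best joint action, and the names penalty_game, choose, run, and follow_prob are hypothetical.

```python
import random


def penalty_game(k=-10):
    """Shared payoff matrix (assumed formulation); k <= 0 is the penalty."""
    return [
        [10, 0, k],
        [0, 2, 0],
        [k, 0, 10],
    ]


def choose(q, epsilon, recommendation, follow_prob, rng):
    """Epsilon-greedy choice, biased toward a supervisor recommendation."""
    # With probability follow_prob the agent takes the recommended action.
    if recommendation is not None and rng.random() < follow_prob:
        return recommendation
    # Otherwise fall back to ordinary epsilon-greedy selection.
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def run(episodes=2000, k=-10, alpha=0.1, epsilon=0.1, follow_prob=0.3, seed=0):
    rng = random.Random(seed)
    payoff = penalty_game(k)
    q = [[0.0] * 3 for _ in range(2)]  # one stateless Q-table per agent
    best_case = None                   # toy "case base": best joint action seen
    best_reward = float("-inf")
    for _ in range(episodes):
        rec = best_case
        a0 = choose(q[0], epsilon, rec[0] if rec is not None else None,
                    follow_prob, rng)
        a1 = choose(q[1], epsilon, rec[1] if rec is not None else None,
                    follow_prob, rng)
        r = payoff[a0][a1]
        # Independent stateless Q-updates on the shared reward.
        q[0][a0] += alpha * (r - q[0][a0])
        q[1][a1] += alpha * (r - q[1][a1])
        # The supervisor sees the joint action and keeps the best case so far.
        if r > best_reward:
            best_reward, best_case = r, (a0, a1)
    return q, best_case


if __name__ == "__main__":
    q_tables, case = run()
    print("remembered joint action:", case)
    print("Q-tables:", q_tables)
```

With follow_prob set to 0 the sketch reduces to two independent epsilon-greedy learners, which gives a baseline against which the effect of the recommendations can be compared.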