Learning automata: an introduction
The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Introduction to Reinforcement Learning
Social Agents Playing a Periodical Policy
EMCL '01 Proceedings of the 12th European Conference on Machine Learning
On No-Regret Learning, Fictitious Play, and Nash Equilibrium
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Asymmetric multiagent reinforcement learning
Web Intelligence and Agent Systems
Scientific Programming - Distributed Computing and Applications
Reaching pareto-optimality in prisoner's dilemma using conditional joint action learning
Autonomous Agents and Multi-Agent Systems
An adaptive policy gradient in learning Nash equilibria
Neurocomputing
A momentum-based approach to learning nash equilibria
PRIMA'06 Proceedings of the 9th Pacific Rim international conference on Agent Computing and Multi-Agent Systems
Coordination is an important issue in multi-agent systems when agents want to maximize their revenue. Coordination is often achieved through communication, but communication has its price. We are interested in an approach that keeps communication between the agents low while still finding a globally optimal behavior.

In this paper we report on an efficient approach that allows independent reinforcement learning agents to reach a Pareto-optimal Nash equilibrium with limited communication. The communication happens at regular time steps and is essentially a signal for the agents to start an exploration phase. During each exploration phase, some agents exclude their current best action so as to give the team the opportunity to look for a possibly better Nash equilibrium. This technique of reducing the action space by exclusions was only recently introduced for finding periodical policies in games of conflicting interests. Here, we explore the technique in repeated common-interest games with deterministic or stochastic outcomes.
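The exclusion mechanism can be illustrated with a minimal sketch. This is not the authors' algorithm: the climbing-game payoff matrix and all names are illustrative, and the agents' learners are simplified to a shared best-payoff memory. At each synchronized exploration phase, one agent excludes its half of the best joint action found so far, forcing the team to search outside the equilibrium it has already settled on.

```python
import random

# Common-interest "climbing game": both agents receive the same payoff
# (payoffs are illustrative; the joint action (0, 0) is the
# Pareto-optimal Nash equilibrium).
PAYOFF = [[11, -30, 0],
          [-30,  7,  6],
          [0,    0,  5]]

def explore_with_exclusions(episodes, phase_len, seed=0):
    """Independent agents sample actions in a common-interest game;
    at the start of each exploration phase one agent rules out its
    half of the best joint action found so far."""
    rng = random.Random(seed)
    n_actions = len(PAYOFF)
    best_joint, best_payoff = None, float("-inf")
    excluded = [None, None]          # per-agent excluded action this phase
    for t in range(episodes):
        if t % phase_len == 0 and best_joint is not None:
            # The periodic "signal": begin a new exploration phase in
            # which one agent excludes its current best action.
            agent = (t // phase_len) % 2
            excluded = [None, None]
            excluded[agent] = best_joint[agent]
        actions = []
        for i in range(2):
            choices = [a for a in range(n_actions) if a != excluded[i]]
            actions.append(rng.choice(choices))
        payoff = PAYOFF[actions[0]][actions[1]]
        if payoff > best_payoff:
            best_joint, best_payoff = tuple(actions), payoff
    return best_joint, best_payoff
```

With enough exploration the sketch settles on the Pareto-optimal joint action, and later exclusion phases verify that no better equilibrium exists; a faithful implementation would replace the best-payoff memory with the agents' actual reinforcement learners and handle stochastic payoffs by averaging.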