Decentralized Learning in Markov Games
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Learning the global maximum with parameterized learning automata
IEEE Transactions on Neural Networks
This paper presents a novel approach to multi-agent coordination in general-sum Markov games. Unlike most work in multi-agent learning, our approach does not aim to reach a particular equilibrium between agent policies. Instead, it learns a basis set of special joint agent policies, over which it can randomize to build different solutions. The main idea is to tackle a Markov game by decomposing it into a set of multi-agent common interest problems, each reflecting the preferences of one agent in the system. With only a minimum of coordination, simple reinforcement learning agents using Parameterised Learning Automata are able to solve this set of common interest problems in parallel. As a result, a team of simple learning agents can switch play between desired joint policies rather than mixing individual policies.
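The decomposition idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm, and it rests on several assumptions: the game is reduced to a single state with two agents, the reward table is a hypothetical example, each common interest problem is solved by exhaustive search as a stand-in for Parameterised Learning Automata, and play switches between the basis joint policies in a fixed round-robin order rather than by randomization.

```python
from itertools import product

# Hypothetical single-state, two-agent general-sum game (an assumption for
# illustration). REWARDS[joint_action] = (reward to agent 0, reward to agent 1).
REWARDS = {
    (0, 0): (2.0, 1.0),
    (0, 1): (0.0, 0.0),
    (1, 0): (0.0, 0.0),
    (1, 1): (1.0, 2.0),
}
N_AGENTS = 2
N_ACTIONS = 2


def common_interest_game(agent):
    """Build the common interest problem in which every agent is paid
    the reward of `agent`, so it reflects that agent's preferences."""
    return {joint: rewards[agent] for joint, rewards in REWARDS.items()}


def solve_common_interest(game):
    """Stand-in for learning: exhaustive search for the best joint action.
    The paper uses Parameterised Learning Automata here instead."""
    joint_actions = product(range(N_ACTIONS), repeat=N_AGENTS)
    return max(joint_actions, key=lambda joint: game[joint])


def basis_joint_policies():
    """One joint policy per agent, each optimal for that agent's problem."""
    return [solve_common_interest(common_interest_game(i)) for i in range(N_AGENTS)]


def average_rewards_when_switching(basis, periods):
    """Switch between basis joint policies round-robin (an assumed schedule)
    and return each agent's average reward over `periods` plays."""
    totals = [0.0] * N_AGENTS
    for t in range(periods):
        joint = basis[t % len(basis)]
        for i in range(N_AGENTS):
            totals[i] += REWARDS[joint][i]
    return [total / periods for total in totals]


basis = basis_joint_policies()
avg = average_rewards_when_switching(basis, periods=4)
print(basis, avg)
```

In this toy game each agent prefers a different joint action, and alternating between the two gives both agents the same average reward. Independently mixing individual policies would instead put probability on the miscoordinated joint actions that pay nothing.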