Optimal convergence in multi-agent MDPs

  • Authors:
  • Peter Vrancx; Katja Verbeeck; Ann Nowé

  • Affiliations:
  • Computational Modeling Lab, Vrije Universiteit Brussel; MICC-IKAT, Maastricht University; Computational Modeling Lab, Vrije Universiteit Brussel

  • Venue:
  • KES'07/WIRN'07 Proceedings of the 11th international conference, KES 2007 and XVII Italian workshop on neural networks conference on Knowledge-based intelligent information and engineering systems: Part III
  • Year:
  • 2007

Abstract

Learning Automata (LA) were recently shown to be valuable tools for designing Multi-Agent Reinforcement Learning algorithms. One of the principal contributions of LA theory is that a set of decentralized, independent learning automata is able to control a finite Markov chain with unknown transition probabilities and rewards. We extend this result to the framework of Multi-Agent MDPs, a straightforward extension of single-agent MDPs to distributed cooperative multi-agent decision problems. Furthermore, we combine this result with the application of parametrized learning automata, yielding global optimal convergence results.
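
As a rough illustration of what a decentralized, independent learning automaton might look like, the sketch below implements a linear reward-inaction update, a common scheme in the LA literature. The abstract does not name the update rule, so this choice is an assumption, and it is not the parametrized automaton the paper combines with its result. The class name, the learning rate, and the two-agent payoff table in `run_demo` are all hypothetical and chosen only for illustration.

```python
import random


class LinearRewardInaction:
    """Sketch of a linear reward-inaction automaton (assumed scheme, not from the paper)."""

    def __init__(self, n_actions, learning_rate=0.05, rng=None):
        # Start from a uniform action probability vector.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate
        self.rng = rng if rng is not None else random.Random()

    def choose(self):
        # Sample an action according to the current probability vector.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is assumed to lie in [0, 1]; a reward of 0 leaves the vector unchanged.
        step = self.learning_rate * reward
        for a in range(len(self.probs)):
            if a == action:
                self.probs[a] += step * (1.0 - self.probs[a])
            else:
                self.probs[a] -= step * self.probs[a]


def run_demo(steps=5000, seed=0):
    """Two independent automata receiving the same (cooperative) reward signal."""
    rng = random.Random(seed)
    # Hypothetical success probability for each joint action.
    payoff = [[0.2, 0.1], [0.1, 0.9]]
    agents = [LinearRewardInaction(2, rng=rng) for _ in range(2)]
    for _ in range(steps):
        joint = [agent.choose() for agent in agents]
        reward = 1.0 if rng.random() < payoff[joint[0]][joint[1]] else 0.0
        # Each agent updates using only its own action and the shared reward.
        for agent, action in zip(agents, joint):
            agent.update(action, reward)
    return [agent.probs for agent in agents]
```

Each automaton sees only its own action and the common reward, which is the sense in which the learners are independent. A plain scheme of this kind can settle on a joint action that is not the best one, which is presumably the gap the parametrized automata mentioned in the abstract are meant to address.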