Partially decentralized reinforcement learning in finite, multi-agent Markov decision processes

  • Authors:
  • Omkar Tilak; Snehasis Mukhopadhyay

  • Affiliations:
  • Department of Computer and Information Science, Indiana University-Purdue University, Indianapolis, IN, USA. E-mail: {otilak, smukhopa}@cs.iupui.edu (corresponding author: otilak@cs.iupui.edu)

  • Venue:
  • AI Communications
  • Year:
  • 2011

Abstract

In this paper, we propose a novel, partially decentralized learning algorithm for the control of finite, multi-agent Markov decision processes with unknown transition probabilities and reward values. One learning automaton is associated with each agent acting in a state, and the automata acting within a state may communicate with each other. However, there is no communication between automata in different states, which makes the system partially decentralized. The proposed algorithms are designed so that the entire team of automata converges to the policy that maximizes the long-term expected reward per step. Simulation results are presented to demonstrate the usefulness of the proposed algorithms.
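
To make the "one automaton per agent per state" layout concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a standard linear reward-inaction update as a stand-in for the update rules, which the abstract does not specify, and it leaves out the within-state communication entirely. The names `LearningAutomaton`, `build_team`, and `learning_rate` are hypothetical and chosen only for illustration.

```python
import random


class LearningAutomaton:
    """Finite-action automaton with a linear reward-inaction update.

    Assumption: this update rule is a textbook stand-in, not the rule
    proposed in the paper.
    """

    def __init__(self, n_actions, learning_rate=0.1, rng=None):
        # Start from a uniform action-probability vector.
        self.probs = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action index according to the current probabilities.
        return self.rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def update(self, action, reward):
        # reward is assumed to lie in [0, 1]; a reward of 0 changes nothing.
        step = self.learning_rate * reward
        for a in range(len(self.probs)):
            if a == action:
                self.probs[a] += step * (1.0 - self.probs[a])
            else:
                self.probs[a] -= step * self.probs[a]


def build_team(n_states, n_agents, n_actions, seed=0):
    # team[s][i] is the automaton of agent i in state s. Automata in
    # different states hold no references to each other.
    rng = random.Random(seed)
    return [
        [LearningAutomaton(n_actions, rng=rng) for _ in range(n_agents)]
        for _ in range(n_states)
    ]
```

In this sketch, only the automata in the currently visited state would select actions and later receive a reward signal, so no information has to pass between states.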