Optimistic-Pessimistic Q-Learning Algorithm for Multi-Agent Systems

  • Authors: Natalia Akchurina

  • Affiliations: International Graduate School of Dynamic Intelligent Systems, University of Paderborn, Paderborn, Germany 33098

  • Venue: MATES '08 Proceedings of the 6th German conference on Multiagent System Technologies
  • Year: 2008

Abstract

A reinforcement learning algorithm for multi-agent systems, OP-Q, is proposed. It is based on Hurwicz's optimistic-pessimistic criterion, which makes it possible to embed preliminary knowledge about the degree of environment friendliness. A proof of its convergence to a stationary policy is given. Thorough testing of the developed algorithm against well-known reinforcement learning algorithms has shown that OP-Q can perform on a par with its opponents.
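
To illustrate the idea behind the criterion, here is a minimal Python sketch. It is not taken from the paper. Hurwicz's criterion scores each option as a weighted mix of its best-case and worst-case outcomes, with the weight expressing how optimistic the decision maker is. The function names `hurwicz_value` and `op_q_update`, the table layout `q[state, own_action, opponent_action]`, and the parameters `optimism`, `lr`, and `gamma` are all assumptions made for this example. The update shown is a generic temporal-difference step that uses the Hurwicz value as its target, and it may differ from the exact OP-Q rule.

```python
import numpy as np


def hurwicz_value(q_state, optimism):
    """Apply Hurwicz's criterion to one state's Q-values.

    q_state[a, b] is the value of own action a against opponent action b
    (hypothetical layout). optimism in [0, 1]: 1 trusts the best case
    (friendly environment), 0 trusts the worst case (hostile environment).
    Returns the best score and the own action that achieves it.
    """
    best_case = q_state.max(axis=1)   # opponent acts in our favour
    worst_case = q_state.min(axis=1)  # opponent acts against us
    scores = optimism * best_case + (1.0 - optimism) * worst_case
    return float(scores.max()), int(scores.argmax())


def op_q_update(q, s, a, b, reward, s_next, optimism, lr=0.1, gamma=0.9):
    """One temporal-difference step with a Hurwicz target (illustrative only)."""
    next_value, _ = hurwicz_value(q[s_next], optimism)
    q[s, a, b] += lr * (reward + gamma * next_value - q[s, a, b])
    return q


# Two own actions (rows) against two opponent actions (columns).
example = np.array([[4.0, 0.0],
                    [2.0, 1.0]])
pessimistic = hurwicz_value(example, optimism=0.0)  # maximin choice
optimistic = hurwicz_value(example, optimism=1.0)   # maximax choice
```

With `optimism=0.0` the sketch reduces to a maximin choice and picks the second action, whose worst case is best. With `optimism=1.0` it reduces to a maximax choice and picks the first action. Intermediate values of `optimism` encode a prior belief about how friendly the other agents are.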