Analysis of methods for solving MDPs

  • Authors: Marek Grześ; Jesse Hoey

  • Affiliations: University of Waterloo, Canada (both authors)

  • Venue: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
  • Year: 2012

Abstract

New proofs for two extensions to value iteration are derived that take into account how the value function is initialised. Theoretical requirements that guarantee the convergence of backward value iteration are identified, along with weaker requirements for the convergence of backups based on best actions only. Experimental results show that standard value iteration runs significantly faster with the simple extensions investigated in this work.
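
For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of textbook value iteration with an explicit initial value function, not the authors' implementation and not either of the two extensions. The function name, the data layout (`P`, `R`), and the two-state toy MDP are assumptions made for this example only.

```python
def value_iteration(P, R, gamma, V0, eps=1e-8, max_iters=10000):
    """Textbook value iteration (illustrative sketch, hypothetical interface).

    P[s][a] is a list of (next_state, probability) pairs,
    R[s][a] is the immediate reward, and V0 is the initial value function.
    Returns the final values and the number of sweeps performed.
    """
    V = list(V0)
    for iteration in range(1, max_iters + 1):
        delta = 0.0
        for s in range(len(P)):
            # Bellman backup: maximise over every action available in s.
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in range(len(P[s]))
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best  # in-place update, so later states see the new value
        if delta < eps:
            return V, iteration
    return V, max_iters


# Toy MDP (an assumption for illustration): state 1 is absorbing with reward 0;
# in state 0, action 0 stays put for reward 0 and action 1 moves to state 1 for reward 1.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(1, 1.0)]],
]
R = [
    [0.0, 1.0],
    [0.0],
]

V_zero, sweeps_zero = value_iteration(P, R, gamma=0.9, V0=[0.0, 0.0])
V_high, sweeps_high = value_iteration(P, R, gamma=0.9, V0=[10.0, 10.0])
```

On this toy problem both initialisations reach the same values, but the number of sweeps differs, which is the kind of dependence on initialisation that the abstract highlights.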