Speeding up the convergence of value iteration in partially observable Markov decision processes

  • Authors:
  • Nevin L. Zhang;Weihong Zhang

  • Affiliations:
  • Department of Computer Science, Hong Kong University of Science & Technology, Kowloon, Hong Kong, China;Department of Computer Science, Hong Kong University of Science & Technology, Kowloon, Hong Kong, China

  • Venue:
  • Journal of Artificial Intelligence Research
  • Year:
  • 2001

Abstract

Partially observable Markov decision processes (POMDPs) have recently become popular among AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs, but it typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: it enabled value iteration to converge after only a few iterations on all the test problems.
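
To give a feel for why plain value iteration can need many iterations, the following is a minimal sketch under stated assumptions; it is not the method proposed in the paper. It runs ordinary value iteration on the belief space of the classic two-door "tiger" POMDP, chosen here only for illustration. The discount factor, the rewards, the listening accuracy, the truncation level K, the stopping threshold, and all function names are assumptions introduced for this example. Because listening does not move the tiger, the belief depends only on the net number of "hear left" observations, so the belief space is modelled as a chain of integers that is clamped at plus or minus K (an approximation).

```python
GAMMA = 0.95      # discount factor (assumed for this illustration)
ACCURACY = 0.85   # probability that listening reports the correct door (assumed)
K = 6             # truncation of the belief chain (an approximation)


def belief_left(k):
    """P(tiger is behind the left door) after k net 'hear left' observations."""
    left, right = ACCURACY ** k, (1 - ACCURACY) ** k
    return left / (left + right)


def backup(V, k):
    """One Bellman backup at belief index k; returns (value, best action)."""
    b = belief_left(k)
    p_hear_left = b * ACCURACY + (1 - b) * (1 - ACCURACY)
    up, down = min(k + 1, K), max(k - 1, -K)  # clamp at the ends of the chain
    listen = -1.0 + GAMMA * (p_hear_left * V[up] + (1 - p_hear_left) * V[down])
    # Opening a door ends the episode and resets the belief to uniform (k = 0).
    open_left = -100.0 * b + 10.0 * (1 - b) + GAMMA * V[0]
    open_right = 10.0 * b - 100.0 * (1 - b) + GAMMA * V[0]
    return max((listen, "listen"),
               (open_left, "open_left"),
               (open_right, "open_right"))


def value_iteration(epsilon=1e-3, max_iters=10000):
    """Iterate until the largest change in value falls below epsilon."""
    V = {k: 0.0 for k in range(-K, K + 1)}
    for n in range(1, max_iters + 1):
        new_V = {k: backup(V, k)[0] for k in V}
        residual = max(abs(new_V[k] - V[k]) for k in V)
        V = new_V
        if residual < epsilon:
            break
    policy = {k: backup(V, k)[1] for k in V}
    return V, policy, n


if __name__ == "__main__":
    V, policy, iterations = value_iteration()
    print("iterations until residual < 1e-3:", iterations)
    print("action at the uniform belief:", policy[0])
```

Each backup shrinks the error by at most a factor of GAMMA, so with a discount close to 1 the iteration count grows quickly even on this tiny problem. Raising GAMMA toward 1 or tightening epsilon in the sketch makes the effect more pronounced.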