Generic rank-one corrections for value iteration in Markovian decision problems

  • Authors: Dimitri P. Bertsekas
  • Affiliations: Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA 02139, USA
  • Venue: Operations Research Letters
  • Year: 1995

Abstract

Given a linear iteration of the form x := F(x), we consider modified versions of the form x := F(x + γd), where d is a fixed direction and γ is chosen to minimize the norm of the residual ‖x + γd − F(x + γd)‖. We propose ways to choose d so that the convergence rate of the modified iteration is governed by the subdominant eigenvalue of the original. In the special case where F relates to a Markovian decision problem, we obtain a new extrapolation method for value iteration. In particular, our method accelerates the Gauss-Seidel version of the value iteration method for discounted problems in the same way that MacQueen's error bounds accelerate the standard version. Furthermore, our method applies equally well to Markov renewal and undiscounted problems.
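
The modified iteration described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes F(x) = Ax + b, assumes the residual is measured in the Euclidean norm (the abstract does not name a norm), and takes d to be a known dominant eigenvector of A purely for illustration. The names matvec, F, best_gamma, and corrected_step are hypothetical. Under these assumptions the residual at x + γd is r + γw with r = x − F(x) and w = d − Ad, so the minimizing step is γ = −(w·r)/(w·w).

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * vj for a, vj in zip(row, v)) for row in A]


def F(A, b, x):
    """Linear map F(x) = A x + b (assumed form of the iteration)."""
    return [ax + bi for ax, bi in zip(matvec(A, x), b)]


def best_gamma(A, b, x, d):
    """Step gamma minimizing the Euclidean norm of the residual at x + gamma*d.

    The residual at x + gamma*d equals r + gamma*w, where r = x - F(x)
    and w = d - A d, so the least-squares minimizer is -(w.r)/(w.w).
    """
    r = [xi - fi for xi, fi in zip(x, F(A, b, x))]
    w = [di - adi for di, adi in zip(d, matvec(A, d))]
    ww = sum(wi * wi for wi in w)
    if ww == 0.0:
        # The residual does not depend on gamma; leave x unshifted.
        return 0.0
    return -sum(wi * ri for wi, ri in zip(w, r)) / ww


def corrected_step(A, b, x, d):
    """One modified iteration x := F(x + gamma*d)."""
    g = best_gamma(A, b, x, d)
    shifted = [xi + g * di for xi, di in zip(x, d)]
    return F(A, b, shifted)


# Illustrative data (assumed): eigenvalues 0.9 and 0.5, fixed point (10, 2).
A = [[0.9, 0.0], [0.0, 0.5]]
b = [1.0, 1.0]
d = [1.0, 0.0]  # eigenvector for the dominant eigenvalue 0.9

x_plain = [0.0, 0.0]
x_corr = [0.0, 0.0]
for _ in range(40):
    x_plain = F(A, b, x_plain)
    x_corr = corrected_step(A, b, x_corr, d)

print("plain iteration:    ", x_plain)
print("corrected iteration:", x_corr)
```

In this toy case the correction removes the error component along d, so the remaining error shrinks at the rate of the second eigenvalue (0.5) rather than the first (0.9). In practice the dominant eigenvector is not known in advance, which is why the choice of d is the substantive question the abstract refers to.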