Functional value iteration for decision-theoretic planning with general utility functions

  • Authors:
  • Yaxin Liu;Sven Koenig

  • Affiliations:
  • Department of Computer Sciences, University of Texas at Austin, Austin, TX;Computer Science Department, University of Southern California, Los Angeles, CA

  • Venue:
  • AAAI'06: Proceedings of the 21st National Conference on Artificial Intelligence - Volume 2
  • Year:
  • 2006

Abstract

We study how to find plans that maximize the expected total utility for a given MDP, a planning objective that is important for decision making in high-stakes domains. Under this objective, the optimal action can depend on the total reward accumulated so far, in addition to the current state. We extend our previous work on functional value iteration from one-switch utility functions to all utility functions that can be approximated with piecewise linear utility functions (with and without exponential tails). Functional value iteration is then used to find a plan that maximizes the expected total utility for the approximate utility function. Rather than maintaining a single value for every state, functional value iteration maintains, for every state, a value function that maps the total reward accumulated so far to a value. We describe how functional value iteration represents these value functions in finite form, how it performs dynamic programming by manipulating these representations, and what kinds of approximation guarantees it is able to make. We also apply it to a probabilistic blocksworld problem, a standard test domain for decision-theoretic planners.
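
The idea of keeping one value function per state, rather than one number per state, can be illustrated with a minimal Python sketch. It is not the representation described in the paper: the paper manipulates finite piecewise linear representations, whereas this sketch simply stores each value function as a Python closure over the accumulated reward `w`. The toy MDP (`mdp`), the state and action names, and the piecewise linear `utility` are all hypothetical and chosen only so that the best action changes with `w`.

```python
from typing import Callable, Dict, List, Tuple

State = str
Outcome = Tuple[float, State, float]          # (probability, next state, reward)
Transitions = Dict[State, Dict[str, List[Outcome]]]
ValueFn = Callable[[float], float]            # accumulated reward w -> value


def utility(w: float) -> float:
    """Hypothetical piecewise linear utility: losses below -4 count triple."""
    return w if w >= -4 else 3 * w + 8


def q_value(outcomes: List[Outcome], values: Dict[State, ValueFn], w: float) -> float:
    """Expected value of one action when the reward accumulated so far is w."""
    return sum(p * values[s2](w + r) for p, s2, r in outcomes)


def backup(transitions: Transitions, values: Dict[State, ValueFn],
           goal: State) -> Dict[State, ValueFn]:
    """One dynamic-programming step that maps value functions to value functions."""
    def make(state: State) -> ValueFn:
        if state == goal:
            return utility                    # at the goal, the value is U(w)
        def v(w: float) -> float:
            return max(q_value(o, values, w) for o in transitions[state].values())
        return v
    return {s: make(s) for s in values}


def functional_value_iteration(transitions: Transitions, goal: State,
                               sweeps: int) -> Dict[State, ValueFn]:
    """Apply `sweeps` backups, starting from 'stop now and collect U(w)'."""
    values: Dict[State, ValueFn] = {s: utility for s in set(transitions) | {goal}}
    for _ in range(sweeps):
        values = backup(transitions, values, goal)
    return values


def best_action(transitions: Transitions, values: Dict[State, ValueFn],
                state: State, w: float) -> str:
    """Greedy action for a (state, accumulated reward) pair."""
    actions = transitions[state]
    return max(actions, key=lambda a: q_value(actions[a], values, w))


# Hypothetical toy MDP: every path reaches "goal" within two steps.
mdp: Transitions = {
    "far": {"walk": [(1.0, "start", -1)]},
    "start": {
        "safe": [(1.0, "goal", -3)],
        "risky": [(0.5, "goal", -1), (0.5, "goal", -4)],
    },
}
values = functional_value_iteration(mdp, goal="goal", sweeps=2)

print(best_action(mdp, values, "start", 0))    # risky
print(best_action(mdp, values, "start", -1))   # safe
print(values["far"](0))                        # -4.0
```

In this toy problem the same state `start` calls for `risky` when nothing has been spent and for `safe` once one unit of reward has been lost, which is the dependence on accumulated reward that the abstract describes. Nested closures become expensive as the number of sweeps grows, which is one reason a finite representation of the value functions is needed in practice.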