Functional value iteration for decision-theoretic planning with general utility functions

Authors:
Yaxin Liu; Sven Koenig
Affiliations:
Department of Computer Sciences, University of Texas at Austin, Austin, TX; Computer Science Department, University of Southern California, Los Angeles, CA
Venue:
AAAI'06: Proceedings of the 21st National Conference on Artificial Intelligence - Volume 2
Year:
2006



Abstract

We study how to find plans that maximize the expected total utility for a given MDP, a planning objective that is important for decision making in high-stakes domains. Under this objective, the optimal action can depend on the total reward accumulated so far in addition to the current state. We extend our previous work on functional value iteration from one-switch utility functions to all utility functions that can be approximated with piecewise linear utility functions (with and without exponential tails): functional value iteration finds a plan that maximizes the expected total utility for the approximating utility function. Instead of maintaining a single value for every state, functional value iteration maintains for every state a value function that maps the total reward accumulated so far to a value. We describe how functional value iteration represents these value functions in finite form, how it performs dynamic programming by manipulating these representations, and what kinds of approximation guarantees it is able to make. We also apply it to a probabilistic blocksworld problem, a standard test domain for decision-theoretic planners.
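
The idea that each state carries a value function over the accumulated reward, rather than a single number, can be illustrated with a minimal sketch. This is not the authors' implementation: the paper represents value functions in a finite piecewise linear form, whereas the sketch below simply uses Python closures. The tiny MDP, the action names `safe` and `risky`, and the piecewise linear utility function with a kink at -3 are all hypothetical, chosen only to show that the best action changes with the reward accumulated so far.

```python
# Illustrative sketch only: value functions are plain Python callables here,
# not the finite piecewise linear representation described in the abstract.

def utility(w):
    """Hypothetical piecewise linear utility of total reward w (kink at -3)."""
    return w if w >= -3 else 3 * w + 6

# Hypothetical MDP: MDP[state][action] = list of (probability, reward, next_state).
MDP = {
    "s0": {
        "safe": [(1.0, -3.0, "goal")],
        "risky": [(0.5, -1.0, "goal"), (0.5, -4.0, "goal")],
    }
}

def backup(values, state):
    """One functional backup: build the value function of `state` from the
    value functions of its successors. Returns (value function, greedy policy),
    both taking the accumulated reward w as their argument."""
    def q(action, w):
        # Expected value of taking `action` with accumulated reward w.
        return sum(p * values[nxt](w + r) for p, r, nxt in MDP[state][action])

    def v(w):
        return max(q(a, w) for a in MDP[state])

    def best(w):
        return max(MDP[state], key=lambda a: q(a, w))

    return v, best

# At the goal, the value of accumulated reward w is just its utility.
values = {"goal": utility}
v, best = backup(values, "s0")

for w in (5.0, 0.0, -5.0):
    print(w, best(w), v(w))
```

With these assumed numbers the greedy action is `risky` at an accumulated reward of 5, `safe` at 0, and `risky` again at -5, so a policy that looks only at the current state cannot be optimal here.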