We study how to find plans that maximize the expected total utility for a given MDP, a planning objective that is important for decision making in high-stakes domains. Under this objective, the optimal action can depend not only on the current state but also on the total reward that has been accumulated so far. We extend our previous work on functional value iteration from one-switch utility functions to all utility functions that can be approximated by piecewise linear utility functions (with or without exponential tails), using functional value iteration to find a plan that maximizes the expected total utility for the approximate utility function. Functional value iteration maintains, for each state, not a single value but a value function that maps the total reward accumulated so far to a value. We describe how functional value iteration represents these value functions in finite form, how it performs dynamic programming by manipulating these representations, and what kinds of approximation guarantees it can make. We also apply it to a probabilistic blocksworld problem, a standard test domain for decision-theoretic planners.
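The core idea — that under a nonlinear utility the best action depends on accumulated wealth, so each state carries a value *function* of wealth rather than a single value — can be illustrated with a small sketch. This is not the paper's algorithm: the paper manipulates exact piecewise linear representations of the value functions, whereas the sketch below discretizes accumulated wealth onto a grid and interpolates. The two-state MDP, the utility function, and all numbers are illustrative assumptions.

```python
import numpy as np

def utility(w):
    # A piecewise linear, risk-averse utility of total wealth:
    # losses below zero hurt twice as much as gains help.
    return np.where(w < 0, 2.0 * w, w)

wealth = np.linspace(-20.0, 20.0, 401)   # grid of accumulated reward

# Toy MDP (illustrative): state "start" and an absorbing "goal".
# Each action gives an immediate reward and a probability of reaching
# the goal; otherwise the agent stays in "start" and tries again.
actions = {
    "safe":  (-1.0, 0.5),   # (reward, P(reach goal))
    "risky": (-4.0, 0.9),
}
GOAL_REWARD = 10.0

# V_goal / V_start map accumulated wealth -> expected utility-to-go.
V_goal = utility(wealth + GOAL_REWARD)   # terminal: collect goal reward
V_start = utility(wealth)                # pessimistic initialization

def shift(V, r):
    # Value function after accruing reward r: V'(w) = V(w + r),
    # evaluated by interpolation on the wealth grid (np.interp clamps
    # at the grid boundary, bounding the iteration).
    return np.interp(wealth + r, wealth, V)

for _ in range(200):                     # functional value iteration
    q = {a: p * shift(V_goal, r) + (1 - p) * shift(V_start, r)
         for a, (r, p) in actions.items()}
    V_start = np.maximum.reduce(list(q.values()))

# The greedy choice is a function of accumulated wealth, which is
# exactly what a single scalar value per state cannot represent.
policy = [max(q, key=lambda a: q[a][i]) for i in range(len(wealth))]
```

With a linear utility the policy would be a single action for the whole wealth range; with a piecewise linear utility, the argmax may differ across wealth segments, so the plan must condition on accumulated reward.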