Risk-sensitive planning in partially observable environments

Authors:
Janusz Marecki;Pradeep Varakantham
Affiliations:
IBM T. J. Watson Research Center, Yorktown Heights, NY;Singapore Management University, Singapore
Venue:
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Year:
2010

Abstract

The Partially Observable Markov Decision Process (POMDP) is a popular framework for planning under uncertainty in partially observable domains. Yet the POMDP model is risk-neutral, in that it assumes the agent maximizes the expected reward of its actions. In contrast, in domains such as financial planning, the agent's decisions are often required to be risk-sensitive (to maximize the utility of the agent's actions under non-linear utility functions). Unfortunately, existing POMDP solvers cannot solve such planning problems exactly. By considering piecewise linear approximations of utility functions, this paper addresses this shortcoming in three contributions: (i) it defines the Risk-Sensitive POMDP model; (ii) it derives the fundamental properties of the underlying value functions and provides a functional value iteration technique to compute them exactly; and (iii) it proposes an efficient procedure to determine the dominated value functions, to speed up the algorithm. Our experiments show that the proposed approach is feasible and applicable to realistic financial planning domains.
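
The difference between a risk-neutral and a risk-sensitive objective can be illustrated with a minimal sketch. It is not the paper's algorithm and involves no POMDP: it compares two one-step lotteries under a piecewise linear utility function. The breakpoints, the payoffs, and the names `piecewise_linear`, `expected`, `safe`, and `risky` are all assumptions chosen for illustration.

```python
from bisect import bisect_right


def piecewise_linear(breakpoints):
    """Build u(x) by linear interpolation between sorted (x, u) breakpoints.

    The breakpoints are hypothetical; outside their range the nearest
    segment is extended linearly.
    """
    xs = [x for x, _ in breakpoints]
    us = [u for _, u in breakpoints]

    def u(x):
        # Index of the segment containing x, clamped to a valid segment.
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        slope = (us[i + 1] - us[i]) / (xs[i + 1] - xs[i])
        return us[i] + slope * (x - xs[i])

    return u


def expected(f, lottery):
    """Expectation of f(payoff) over a list of (payoff, probability) pairs."""
    return sum(p * f(x) for x, p in lottery)


# Assumed concave utility: slope 1 up to a payoff of 50, slope 0.25 above it.
utility = piecewise_linear([(0, 0.0), (50, 50.0), (120, 67.5)])

# Two hypothetical actions, each described by its payoff distribution.
actions = {
    "safe": [(50, 1.0)],
    "risky": [(0, 0.5), (120, 0.5)],
}

# Risk-neutral agent: maximize expected reward (identity utility).
risk_neutral_choice = max(actions, key=lambda a: expected(lambda x: x, actions[a]))

# Risk-sensitive agent: maximize expected utility.
risk_sensitive_choice = max(actions, key=lambda a: expected(utility, actions[a]))

print(risk_neutral_choice, risk_sensitive_choice)
```

Under these assumed numbers the risky action has the higher expected reward (60 versus 50) but the lower expected utility (33.75 versus 50), so the two objectives select different actions. In the sequential, partially observable setting the abstract describes, this comparison would have to be made over accumulated reward and belief states, which this sketch does not attempt.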