Probabilistic planning with non-linear utility functions and worst-case guarantees

  • Authors:
  • Stefano Ermon;Carla Gomes;Bart Selman;Alexander Vladimirsky

  • Affiliations:
  • Cornell University;Cornell University;Cornell University;Cornell University

  • Venue:
  • Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Markov Decision Processes are one of the most widely used frameworks to formulate probabilistic planning problems. Since planners are often risk-sensitive in high-stake situations, non-linear utility functions are often introduced to describe their preferences among all possible outcomes. Alternatively, risk-sensitive decision makers often require their plans to satisfy certain worst-case guarantees. We show how to combine these two approaches by considering problems where we maximize the expected utility of the total reward subject to worst-case constraints. We generalize several existing results on the structure of optimal policies to the constrained case, both for finite and infinite horizon problems. We provide a Dynamic Programming algorithm to compute the optimal policy, and we introduce an admissible heuristic to effectively prune the search space. Finally, we use a stochastic shortest path problem on large real-world road networks to demonstrate the practical applicability of our method.