Planning and programming with first-order Markov decision processes: insights and challenges

  • Authors:
  • Craig Boutilier

  • Affiliations:
  • University of Toronto, Toronto

  • Venue:
  • TARK '01 Proceedings of the 8th conference on Theoretical aspects of rationality and knowledge
  • Year:
  • 2001

Abstract

Markov decision processes (MDPs) have become the de facto standard model for decision-theoretic planning problems. However, classic dynamic programming algorithms for MDPs [22] require explicit state and action enumeration. For example, the classical representation of a value function is a table or vector associating a value with each system state; such value functions are produced by iterating over the state space. Since state spaces grow exponentially with the number of domain features, the direct application of these methods to AI planning problems is limited. Furthermore, for infinite and continuous state spaces, such methods cannot be used at all without special knowledge of the form of the value function or optimal control policy.
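As a concrete illustration of the state-enumeration problem the abstract describes, the sketch below runs tabular value iteration on a hypothetical three-state toy MDP (the MDP and all names are illustrative, not from the paper). Note that the value function is an explicit table with one entry per state, and each Bellman backup sweeps the entire state space, which is why the approach breaks down when states grow exponentially in the number of features.

```python
# Toy MDP (hypothetical, for illustration only):
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 0.5), (2, 0.5)], 1: [(2, 1.0)]},
    2: {0: [(2, 1.0)], 1: [(0, 1.0)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}, 2: {0: 0.0, 1: 0.5}}
gamma = 0.9  # discount factor

def value_iteration(P, R, gamma, eps=1e-8):
    # The value function is an explicit table: one entry per system state.
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:  # full sweep over the (enumerated) state space
            # Bellman backup: best one-step lookahead over actions.
            q = [R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                 for a in P[s]]
            v_new = max(q)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < eps:  # converged to (near-)optimal values
            return V

V = value_iteration(P, R, gamma)
```

The cost of each sweep is proportional to the number of states; with n boolean domain features the table has 2^n entries, which is exactly the blow-up that motivates the first-order (structured) representations the paper discusses.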