Satisficing strategies for resource-limited policy search in dynamic environments

  • Authors:
  • Dmitri Dolgov and Edmund H. Durfee

  • Affiliations:
  • University of Michigan, Ann Arbor, MI (both authors)

  • Venue:
  • Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), Part 3
  • Year:
  • 2002

Abstract

In this work, we examine the problem of searching for schedulable real-time control policies for resource-limited agents acting in dynamic environments. The dynamic properties of the environment and the resource limitations of the agent render solving for an optimal policy infeasible, so we instead search for a satisficing policy. We view the policy search as a search for a state reachability graph with an action assigned to each state in the graph. The search algorithm exploits properties of reachability graphs to propagate failure conditions from inherent failure states to other states in the graph, which lets us apply constraint satisfaction techniques to quickly remove some unacceptable policies from consideration. Our analysis and experiments show that, under certain conditions, such as when the "safe" states in the reachability graph are separated from the failure states by a relatively small set of states, backtracking and memoization techniques significantly improve the efficiency of the search algorithm.
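
To make the failure-propagation idea concrete, the sketch below (in Python) shows one simplified way such pruning could work. It is an illustration under assumptions of our own, not the authors' algorithm: states, candidate actions, and possible successors are given as plain dictionaries, a policy is taken to be acceptable iff no inherent failure state is reachable under it, and the scheduling constraints, backtracking, and memoization refinements mentioned in the abstract are omitted. All names (propagate_failure, find_satisficing_policy, and so on) are hypothetical.

# Illustrative sketch only -- not the authors' algorithm. Assumes a
# simplified model: a finite set of states, a set of candidate actions per
# state, a set of possible successor states per (state, action) pair, and
# a policy that is acceptable iff no inherent failure state is reachable
# under it. Scheduling constraints, backtracking, and memoization are omitted.

from typing import Dict, Optional, Set, Tuple

State = str
Action = str
# successors[(state, action)] -> set of possible next states
Successors = Dict[Tuple[State, Action], Set[State]]


def propagate_failure(
    actions: Dict[State, Set[Action]],
    successors: Successors,
    failure_states: Set[State],
) -> Set[State]:
    """Fixpoint propagation: a state is 'doomed' if it is an inherent failure
    state, or if every candidate action from it may reach a doomed state.
    (States with no candidate actions are treated here as safe and absorbing.)"""
    doomed = set(failure_states)
    changed = True
    while changed:
        changed = False
        for state, acts in actions.items():
            if state in doomed or not acts:
                continue
            if all(successors.get((state, a), set()) & doomed for a in acts):
                doomed.add(state)
                changed = True
    return doomed


def find_satisficing_policy(
    start: State,
    actions: Dict[State, Set[Action]],
    successors: Successors,
    failure_states: Set[State],
) -> Optional[Dict[State, Action]]:
    """Assign, to every state reachable from `start`, an action whose possible
    successors all avoid the doomed set; return None if no such policy exists."""
    doomed = propagate_failure(actions, successors, failure_states)
    if start in doomed:
        return None  # every policy from the start state can reach a failure
    policy: Dict[State, Action] = {}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        if state in policy or not actions.get(state):
            continue
        # Pick any action that cannot reach a doomed state (a satisficing choice).
        action = next(
            a for a in actions[state]
            if not successors.get((state, a), set()) & doomed
        )
        policy[state] = action
        frontier.extend(
            s for s in successors.get((state, action), set()) if s not in policy
        )
    return policy


if __name__ == "__main__":
    # Tiny example: from s0, action "a" risks reaching the failure state f,
    # while action "b" stays within safe states.
    actions = {"s0": {"a", "b"}, "s1": set()}
    successors = {("s0", "a"): {"f", "s1"}, ("s0", "b"): {"s1"}}
    print(find_satisficing_policy("s0", actions, successors, {"f"}))
    # -> {'s0': 'b'}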