Reinforcement Learning in Nonstationary Environment Navigation Tasks

  • Authors:
  • Terran Lane, Martin Ridens, Scott Stevens

  • Affiliations:
  • Department of Computer Science, University of New Mexico (all authors)

  • Venue:
  • CAI '07: Proceedings of the 20th Conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
  • Year:
  • 2007


Abstract

The field of reinforcement learning (RL) has made great strides in learning control knowledge from closed-loop interaction with environments. "Classical" RL, based on atomic state-space representations, suffers from an inability to adapt to nonstationarities in the target Markov decision process (i.e., the environment). Relational RL is widely seen as a potential solution to this shortcoming. In this paper, we demonstrate a class of "pseudo-relational" learning methods for nonstationary navigational RL domains: domains in which the location of the goal, or even the structure of the environment, can change over time. Our approach is closely related to deictic representations, which have previously been found to be troublesome for RL. The key insight of this paper is that navigational problems are a highly constrained class of MDPs, possessing a strong native topology that relaxes some of the partial-observability difficulties arising from deixis. Agents can act effectively by employing local information that is relevant to their near-term action choices. We demonstrate that, unlike agents with an atomic representation, our agents learn to adapt fluidly to changing goal locations and environment structure.
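To make the abstract's core idea concrete, consider the contrast between indexing value estimates by absolute grid coordinates (an atomic state) and keying them on a small egocentric window of the agent's surroundings, so the same local pattern transfers across locations and across changes in the map. The Python sketch below is not the authors' method; the gridworld, reward values, and names (local_observation, step) are illustrative assumptions, and the radius-1 window deliberately exhibits the state aliasing (partial observability) that deixis introduces and that, per the abstract, navigation's native topology helps tolerate.

```python
import random
from collections import defaultdict

# Hypothetical gridworld: '#' = wall, '.' = open, 'G' = goal.
GRID = [
    "#######",
    "#..#..#",
    "#..#.G#",
    "#.....#",
    "#######",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def local_observation(pos, grid, radius=1):
    """Deictic-style observation: the contents of cells within `radius`
    of the agent, expressed relative to its position (no absolute coordinates)."""
    r, c = pos
    cells = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                cells.append(grid[rr][cc])
            else:
                cells.append('#')  # treat out-of-bounds as wall
    return tuple(cells)  # hashable key for tabular Q-learning

def step(pos, action, grid):
    """Move unless blocked; reward +1 at the goal, small step cost otherwise."""
    r, c = pos
    dr, dc = action
    nr, nc = r + dr, c + dc
    if grid[nr][nc] == '#':
        nr, nc = r, c  # bumped into a wall: stay put
    done = grid[nr][nc] == 'G'
    reward = 1.0 if done else -0.01
    return (nr, nc), reward, done

# Tabular Q-learning keyed on local observations instead of absolute (x, y) states.
Q = defaultdict(float)
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(2000):
    pos = (1, 1)
    for t in range(200):
        obs = local_observation(pos, GRID)
        if random.random() < epsilon:
            a = random.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: Q[(obs, i)])
        new_pos, reward, done = step(pos, ACTIONS[a], GRID)
        next_obs = local_observation(new_pos, GRID)
        best_next = max(Q[(next_obs, i)] for i in range(len(ACTIONS)))
        target = reward + gamma * (0.0 if done else best_next)
        Q[(obs, a)] += alpha * (target - Q[(obs, a)])
        pos = new_pos
        if done:
            break
```

Because the learned table is keyed on local patterns rather than absolute cells, moving the goal or rearranging distant parts of the map leaves most entries meaningful, which is the kind of fluid adaptation the abstract claims an atomic representation cannot provide.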