The field of reinforcement learning (RL) has made great strides in learning control knowledge from closed-loop interaction with environments. "Classical" RL, based on atomic state space representations, suffers from an inability to adapt to nonstationarities in the target Markov decision process (i.e., environment). Relational RL is widely seen as a potential solution to this shortcoming. In this paper, we demonstrate a class of "pseudo-relational" learning methods for nonstationary navigational RL domains, domains in which the location of the goal, or even the structure of the environment, can change over time. Our approach is closely related to deictic representations, which have previously been found to be troublesome for RL. The key insight of this paper is that navigational problems are a highly constrained class of MDP, possessing a strong native topology that relaxes some of the partial observability difficulties arising from deixis. Agents can employ local information that is relevant to their near-term action choices to act effectively. We demonstrate that, unlike an atomic representation, our agents can learn to fluidly adapt to changing goal locations and environment structure.
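The contrast between atomic and deictic (egocentric) state can be sketched concretely. The snippet below is a minimal illustration, not the paper's actual method: it assumes a gridworld, a hypothetical 3x3 local-wall window, and a hypothetical goal-direction feature. The point is that two distinct absolute cells can yield the same egocentric observation, so value estimates learned under it can transfer when the goal or maze changes, whereas atomic states cannot.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def atomic_state(pos):
    # Atomic representation: the agent's absolute (row, col) cell.
    # If the goal moves or walls change, every learned value is stale.
    return pos

def deictic_observation(grid, pos, goal):
    """Egocentric observation: the local 3x3 wall layout around the
    agent plus the coarse direction of the goal relative to the agent.
    Both features are hypothetical choices for illustration only."""
    r, c = pos
    rows, cols = len(grid), len(grid[0])
    window = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            # Cells outside the grid count as blocked.
            blocked = not (0 <= rr < rows and 0 <= cc < cols) or grid[rr][cc] == 1
            window.append(int(blocked))
    gr, gc = goal
    heading = (sign(gr - r), sign(gc - c))
    return tuple(window), heading

# Two different absolute positions with the same local structure and the
# same relative goal direction produce identical deictic observations.
grid = [[0] * 5 for _ in range(5)]          # open 5x5 gridworld
obs_a = deictic_observation(grid, (2, 2), (0, 2))
obs_b = deictic_observation(grid, (3, 1), (1, 1))
```

Here `obs_a == obs_b` even though `atomic_state((2, 2)) != atomic_state((3, 1))`: knowledge attached to the shared observation generalizes across locations, which is the intuition behind using local, action-relevant information in navigational domains.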