The field of reinforcement learning (RL) has made great strides in learning control knowledge from closed-loop interaction with environments. "Classical" RL, based on atomic state space representations, suffers from an inability to adapt to nonstationarities in the target Markov decision process (i.e., environment). Relational RL is widely seen as a potential solution to this shortcoming. In this paper, we demonstrate a class of "pseudo-relational" learning methods for nonstationary navigational RL domains, domains in which the location of the goal, or even the structure of the environment, can change over time. Our approach is closely related to deictic representations, which have previously been found to be troublesome for RL. The key insight of this paper is that navigational problems are a highly constrained class of MDP, possessing a strong native topology that relaxes some of the partial observability difficulties arising from deixis. Agents can employ local information that is relevant to their near-term action choices to act effectively. We demonstrate that, unlike an atomic representation, our agents can learn to fluidly adapt to changing goal locations and environment structure.
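The contrast between atomic and deictic (egocentric) state can be sketched concretely. The snippet below is a minimal illustration, not the paper's actual method: it assumes a gridworld, a hypothetical 3x3 local-wall window, and a hypothetical goal-direction feature. The point is that two distinct absolute cells can yield the same egocentric observation, so value estimates learned under it can transfer when the goal or maze changes, whereas atomic states cannot.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def atomic_state(pos):
    # Atomic representation: the agent's absolute (row, col) cell.
    # If the goal moves or walls change, every learned value is stale.
    return pos

def deictic_observation(grid, pos, goal):
    """Egocentric observation: the local 3x3 wall layout around the
    agent plus the coarse direction of the goal relative to the agent.
    Both features are hypothetical choices for illustration only."""
    r, c = pos
    rows, cols = len(grid), len(grid[0])
    window = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            # Cells outside the grid count as blocked.
            blocked = not (0 <= rr < rows and 0 <= cc < cols) or grid[rr][cc] == 1
            window.append(int(blocked))
    gr, gc = goal
    heading = (sign(gr - r), sign(gc - c))
    return tuple(window), heading

# Two different absolute positions with the same local structure and the
# same relative goal direction produce identical deictic observations.
grid = [[0] * 5 for _ in range(5)]          # open 5x5 gridworld
obs_a = deictic_observation(grid, (2, 2), (0, 2))
obs_b = deictic_observation(grid, (3, 1), (1, 1))
```

Here `obs_a == obs_b` even though `atomic_state((2, 2)) != atomic_state((3, 1))`: knowledge attached to the shared observation generalizes across locations, which is the intuition behind using local, action-relevant information in navigational domains.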