Memory Approaches to Reinforcement Learning in Non-Markovian Domains

  • Authors:
  • Long-Ji Lin; Tom M. Mitchell

  • Venue:
  • Technical Report CMU-CS-92-138, School of Computer Science, Carnegie Mellon University
  • Year:
  • 1992

Abstract

Reinforcement learning is a type of unsupervised learning for sequential decision making. Q-learning is probably the best-understood reinforcement learning algorithm. In Q-learning, the agent learns a mapping from states and actions to their utilities. An important assumption of Q-learning is the Markovian environment assumption: any information needed to determine the optimal action is reflected in the agent's state representation. Consider an agent whose state representation is based solely on its immediate perceptual sensations. When its sensors cannot make essential distinctions among world states, the Markov assumption is violated, causing a problem called perceptual aliasing. For example, an agent facing a closed box cannot act optimally on the basis of its current visual sensation alone if the optimal action depends on the contents of the box. There are two basic approaches to this problem: using more sensors, or using history to determine the current world state. This paper studies three connectionist approaches that learn to use history to handle perceptual aliasing: the window-Q, recurrent-Q, and recurrent-model architectures. An empirical study of these architectures is presented, and their relative strengths and weaknesses are discussed.
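
The abstract describes Q-learning as a mapping from states and actions to utilities, and the window-Q idea of disambiguating aliased observations with a fixed window of recent history. Below is a minimal illustrative sketch, not taken from the paper: tabular Q-learning in which the agent's "state" is a sliding window of its k most recent actions and observations. The `env` interface, the window length `k`, and the hyperparameters are assumptions made for illustration only.

```python
import random
from collections import defaultdict, deque

def epsilon_greedy(Q, state, actions, eps):
    """With probability eps explore randomly; otherwise take the greedy action."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def window_q_learning(env, actions, k=2, episodes=500,
                      alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning over length-k windows of (action, observation) history."""
    Q = defaultdict(float)                        # Q[(window, action)] -> estimated utility
    for _ in range(episodes):
        obs = env.reset()                         # assumed interface: reset() -> observation
        window = deque([(None, obs)] * k, maxlen=k)
        state = tuple(window)
        done = False
        while not done:
            a = epsilon_greedy(Q, state, actions, eps)
            obs, reward, done = env.step(a)       # assumed: step(a) -> (observation, reward, done)
            window.append((a, obs))               # the history window disambiguates aliased observations
            next_state = tuple(window)
            best_next = max(Q[(next_state, b)] for b in actions)
            # One-step Q-learning update toward the bootstrapped target.
            target = reward + (0.0 if done else gamma * best_next)
            Q[(state, a)] += alpha * (target - Q[(state, a)])
            state = next_state
    return Q
```

With the raw observation alone as the state, perceptually aliased situations (such as the closed box) would look identical to the learner; the window gives it a fixed-length memory. This corresponds to the simplest of the three architectures the paper compares, whereas the recurrent-Q and recurrent-model architectures learn what to remember using recurrent networks.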