Continuous state/action reinforcement learning: A growing self-organizing map approach

  • Authors:
  • Hesam Montazeri; Sajjad Moradi; Reza Safabakhsh

  • Affiliations:
  • Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran 15914, Iran; Department of Computer Science and Engineering, University of Texas at Arlington, TX, USA; Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran 15914, Iran

  • Venue:
  • Neurocomputing
  • Year:
  • 2011

Abstract

This paper proposes an algorithm for reinforcement learning (RL) problems with continuous state and action spaces. Continuous-state RL problems have been studied extensively, but problems with continuous action spaces remain comparatively underexplored. To handle the non-stationary, very large, and continuous nature of such problems, the proposed algorithm uses two growing self-organizing maps (GSOM) to approximate the state and action spaces through the addition and deletion of neurons. GSOM has been shown to outperform the standard SOM in topology preservation, quantization error reduction, and approximation of non-stationary distributions. The proposed algorithm simultaneously seeks the best representation of the state space, an accurate estimation of Q-values, and an appropriate representation of the highly rewarded regions of the action space. Experimental results on delayed-reward, non-stationary, and large-scale problems demonstrate very satisfactory performance of the proposed algorithm.
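The idea of coupling a growing quantizer over the state space with one over the action space, and learning a tabular Q-function over their units, can be sketched as follows. This is a hypothetical, heavily simplified toy (the growth rule, learning rates, thresholds, and the one-step reward task are all illustrative assumptions, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

class GrowingSOM:
    """Toy growing quantizer (an illustrative stand-in for the paper's GSOM):
    the winning unit moves toward each input, and a new unit is inserted
    wherever the quantization error exceeds a threshold."""
    def __init__(self, dim, lr=0.2, grow_threshold=0.3, max_units=30):
        self.units = [np.zeros(dim)]
        self.lr, self.grow_threshold, self.max_units = lr, grow_threshold, max_units

    def winner(self, x):
        dists = [np.linalg.norm(u - x) for u in self.units]
        i = int(np.argmin(dists))
        return i, dists[i]

    def update(self, x):
        i, err = self.winner(x)
        if err > self.grow_threshold and len(self.units) < self.max_units:
            self.units.append(np.array(x, dtype=float))  # grow: new unit at x
            return len(self.units) - 1
        self.units[i] = self.units[i] + self.lr * (x - self.units[i])
        return i

# Two maps discretize the continuous spaces; Q is tabular over their units.
state_map, action_map = GrowingSOM(1), GrowingSOM(1)
Q = np.zeros((state_map.max_units, action_map.max_units))
alpha = 0.5

# Toy one-step task: reward is highest when the action matches the state.
for _ in range(2000):
    s = rng.uniform(0, 1, size=1)
    si = state_map.update(s)
    n_a = len(action_map.units)
    if rng.random() < 0.3:
        a = rng.uniform(0, 1, size=1)   # explore a fresh continuous action
        ai = action_map.update(a)
    else:
        ai = int(np.argmax(Q[si, :n_a]))  # greedy over existing action units
        a = action_map.units[ai]
    r = -float((s - a) ** 2)            # one-step reward, no next state
    Q[si, ai] += alpha * (r - Q[si, ai])
```

Both maps start with a single unit and grow as inputs fall outside the current units' coverage, so the resolution of the state and action codebooks adapts to the regions actually visited, which is the intuition behind using growing maps here.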