We introduce a reinforcement learning architecture designed for problems with an infinite number of states, where each state can be seen as a vector of real numbers, and with a finite number of actions, where each action requires a vector of real numbers as parameters. The main objective of this architecture is to distribute the work required to learn the final policy across two actors. One actor decides what action must be performed; meanwhile, a second actor determines the right parameters for the selected action. We tested our architecture, and one algorithm based on it, by solving the robot dribbling problem, a challenging robot control problem taken from the RoboCup competitions. Our experimental work with three different function approximators provides strong evidence that the proposed architecture can be used to implement fast, robust, and reliable reinforcement learning algorithms.
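The two-actor split described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation): a `DiscreteActor` picks which action to perform from a Q-estimate over the real-valued state, and a separate `ParameterActor` supplies the continuous parameter vector for whichever action was chosen. Both actors here use simple linear maps purely for illustration; the paper evaluates three different function approximators.

```python
import numpy as np

rng = np.random.default_rng(0)

class DiscreteActor:
    """First actor: decides WHAT action to perform (illustrative linear Q sketch)."""
    def __init__(self, state_dim, n_actions):
        self.w = rng.normal(scale=0.1, size=(n_actions, state_dim))

    def select(self, state):
        q_values = self.w @ state          # one Q estimate per discrete action
        return int(np.argmax(q_values))

class ParameterActor:
    """Second actor: determines the continuous PARAMETERS for the chosen action."""
    def __init__(self, state_dim, param_dims):
        # one linear parameter policy per discrete action (hypothetical choice)
        self.w = [rng.normal(scale=0.1, size=(d, state_dim)) for d in param_dims]

    def parameters(self, state, action):
        return self.w[action] @ state      # real-valued parameter vector

# Usage: a 4-dimensional continuous state; 2 discrete actions that take
# 3 and 2 continuous parameters respectively (e.g. kick vs. dribble primitives).
state = np.array([0.5, -0.2, 1.0, 0.3])
what_actor = DiscreteActor(state_dim=4, n_actions=2)
param_actor = ParameterActor(state_dim=4, param_dims=[3, 2])
action = what_actor.select(state)
params = param_actor.parameters(state, action)
```

Each actor can then be trained with its own reinforcement learning update, which is the division of labor the architecture is designed to exploit.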