We introduce a reinforcement learning architecture designed for problems with an infinite number of states, where each state can be seen as a vector of real numbers, and with a finite number of actions, where each action requires a vector of real numbers as parameters. The main objective of this architecture is to distribute the work required to learn the final policy across two actors. One actor decides what action must be performed; meanwhile, a second actor determines the right parameters for the selected action. We tested our architecture, and one algorithm based on it, by solving the robot dribbling problem, a challenging robot control problem taken from the RoboCup competitions. Our experimental work with three different function approximators provides strong evidence that the proposed architecture can be used to implement fast, robust, and reliable reinforcement learning algorithms.
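The two-actor split described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation): a `DiscreteActor` picks which action to perform from a Q-estimate over the real-valued state, and a separate `ParameterActor` supplies the continuous parameter vector for whichever action was chosen. Both actors here use simple linear maps purely for illustration; the paper evaluates three different function approximators.

```python
import numpy as np

rng = np.random.default_rng(0)

class DiscreteActor:
    """First actor: decides WHAT action to perform (illustrative linear Q sketch)."""
    def __init__(self, state_dim, n_actions):
        self.w = rng.normal(scale=0.1, size=(n_actions, state_dim))

    def select(self, state):
        q_values = self.w @ state          # one Q estimate per discrete action
        return int(np.argmax(q_values))

class ParameterActor:
    """Second actor: determines the continuous PARAMETERS for the chosen action."""
    def __init__(self, state_dim, param_dims):
        # one linear parameter policy per discrete action (hypothetical choice)
        self.w = [rng.normal(scale=0.1, size=(d, state_dim)) for d in param_dims]

    def parameters(self, state, action):
        return self.w[action] @ state      # real-valued parameter vector

# Usage: a 4-dimensional continuous state; 2 discrete actions that take
# 3 and 2 continuous parameters respectively (e.g. kick vs. dribble primitives).
state = np.array([0.5, -0.2, 1.0, 0.3])
what_actor = DiscreteActor(state_dim=4, n_actions=2)
param_actor = ParameterActor(state_dim=4, param_dims=[3, 2])
action = what_actor.select(state)
params = param_actor.parameters(state, action)
```

Each actor can then be trained with its own reinforcement learning update, which is the division of labor the architecture is designed to exploit.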