This paper considers the efficiency and autonomy required to make reinforcement learning suitable for real-life control tasks. It presents a real-time reinforcement learning algorithm that repeatedly adjusts the control policy using previously collected samples and autonomously estimates appropriate step-sizes for the learning updates. The algorithm is based on actor-critic with experience replay, with step-sizes determined on-line by an enhanced fixed-point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and a simulated half-cheetah demonstrates that the proposed algorithm can solve difficult learning control problems autonomously and within a reasonably short time.
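To make the idea of adjusting the policy from previously collected samples concrete, the following is a minimal sketch of actor-critic with experience replay on a toy problem. It is not the paper's algorithm. The chain environment, the tabular actor and critic, the truncated importance ratio, and all names and constants are assumptions made for illustration. In particular, the step-sizes here are fixed by hand, whereas the paper estimates them on-line with a fixed-point algorithm that is not reproduced here.

```python
import math
import random

GAMMA = 0.9          # discount factor (illustrative value, not from the paper)
N_STATES = 4         # toy chain with states 0..3; state 3 is terminal
ACTIONS = (-1, +1)   # move left / move right

def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def env_step(state, action_idx):
    """Hypothetical toy environment: reward 1 on reaching the right end."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def replay_update(theta, v, sample, actor_lr, critic_lr, rho_max=2.0):
    """One actor-critic update from a stored sample.

    The sample was generated by an older policy, so the update is
    reweighted by a truncated importance ratio (an assumed correction).
    """
    s, a, r, s2, done, mu = sample
    pi = softmax(theta[s])
    rho = min(pi[a] / mu, rho_max)
    target = r + (0.0 if done else GAMMA * v[s2])
    delta = target - v[s]                      # temporal-difference error
    v[s] += critic_lr * rho * delta            # critic step
    for b in range(len(ACTIONS)):              # actor step along grad log pi
        grad = (1.0 if b == a else 0.0) - pi[b]
        theta[s][b] += actor_lr * rho * delta * grad

def train(steps=2000, replays_per_step=5, capacity=500, seed=0):
    """Interact with the environment and replay stored samples after each step."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # actor preferences
    v = [0.0] * N_STATES                           # critic values
    buffer = []                                    # experience replay store
    state = 0
    for _ in range(steps):
        pi = softmax(theta[state])
        a = 0 if rng.random() < pi[0] else 1
        s2, r, done = env_step(state, a)
        # Store the behaviour probability so replayed updates can be reweighted.
        buffer.append((state, a, r, s2, done, pi[a]))
        if len(buffer) > capacity:
            buffer.pop(0)                          # discard the oldest sample
        for _ in range(replays_per_step):
            # Fixed step-sizes: a stand-in for the paper's on-line estimation.
            replay_update(theta, v, rng.choice(buffer), 0.05, 0.1)
        state = 0 if done else s2
    return theta, v, buffer
```

Calling `train()` returns the learned preferences, the value estimates, and the final buffer contents. The point of the sketch is the inner loop: each real interaction step is followed by several updates drawn from stored samples, which is what allows more learning per unit of real-time experience.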