2013 Special Issue: Autonomous reinforcement learning with experience replay

Authors:
Paweł Wawrzyński;Ajay Kumar Tanwani
Affiliations:
Warsaw University of Technology, Institute of Control and Computation Engineering, Poland;École Polytechnique Fédérale de Lausanne, Switzerland
Venue:
Neural Networks
Year:
2013

Abstract

This paper addresses the efficiency and autonomy required to make reinforcement learning suitable for real-life control tasks. A real-time reinforcement learning algorithm is presented that repeatedly adjusts the control policy using previously collected samples and autonomously estimates appropriate step-sizes for the learning updates. The algorithm is based on the actor-critic with experience replay, with step-sizes determined on-line by an enhanced fixed point algorithm for on-line neural network training. An experimental study with a simulated octopus arm and a half-cheetah demonstrates that the proposed algorithm can solve difficult learning control problems autonomously and within a reasonably short time.
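
The idea of repeatedly adjusting a learner "with the use of previously collected samples" can be illustrated with a minimal sketch. It is not the authors' algorithm: it shows only a tabular critic updated by TD(0) on replayed samples, and it uses a fixed, hand-picked step-size where the paper estimates step-sizes autonomously. The names `ReplayBuffer` and `replay_td_updates`, the two-step chain environment, and all parameter values are assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) samples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest sample once capacity is reached.
        self.samples = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state):
        self.samples.append((state, action, reward, next_state))

    def sample(self, rng, k):
        # Draw k stored samples uniformly at random, with replacement.
        return [rng.choice(self.samples) for _ in range(k)]


def replay_td_updates(values, buffer, rng, n_updates, step_size, gamma):
    """Apply n_updates TD(0) critic updates using replayed samples."""
    for s, _action, r, s_next in buffer.sample(rng, n_updates):
        td_error = r + gamma * values[s_next] - values[s]
        values[s] += step_size * td_error  # fixed step-size (assumption)
    return values


# Hypothetical two-step chain: 0 -> 1 (reward 0), 1 -> 2 (reward 1).
# State 2 is terminal, so its value is never updated and stays 0.
rng = random.Random(0)
buffer = ReplayBuffer(capacity=100)
buffer.add(0, 0, 0.0, 1)
buffer.add(1, 0, 1.0, 2)

values = [0.0, 0.0, 0.0]
replay_td_updates(values, buffer, rng, n_updates=2000, step_size=0.1, gamma=0.9)
print(values)
```

Only two transitions were ever collected, yet replaying them many times is enough for the value estimates to settle. This reuse of stored samples is the efficiency gain that experience replay aims for; choosing `step_size` by hand is the part the abstract says the proposed algorithm automates.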