Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method

Authors:
Martin Riedmiller
Affiliations:
Neuroinformatics Group, University of Osnabrück, Osnabrück
Venue:
ECML'05 Proceedings of the 16th European conference on Machine Learning
Year:
2005

Citing 5

Practical Issues in Temporal Difference Learning
Machine Learning

Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching
Machine Learning

Introduction to Reinforcement Learning
Introduction to Reinforcement Learning

Least-squares policy iteration
The Journal of Machine Learning Research

Tree-Based Batch Mode Reinforcement Learning
The Journal of Machine Learning Research

Cited 29

Evolutionary Function Approximation for Reinforcement Learning
The Journal of Machine Learning Research

Empirical Studies in Action Selection with Reinforcement Learning
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems

Batch reinforcement learning in a complex domain
Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems

Model-based function approximation in reinforcement learning
Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems

Non-parametric policy gradients: a unified treatment of propositional and relational domains
Proceedings of the 25th international conference on Machine learning

Finite-Time Bounds for Fitted Value Iteration
The Journal of Machine Learning Research

Rollout sampling approximate policy iteration
Machine Learning

Reinforcement learning for DEC-MDPs with changing action sets and partially ordered dependencies
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 3

An Analysis of Case-Based Value Function Approximation by Approximating State Transition Graphs
ICCBR '07 Proceedings of the 7th international conference on Case-Based Reasoning: Case-Based Reasoning Research and Development

Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II

Evaluation of Batch-Mode Reinforcement Learning Methods for Solving DEC-MDPs with Changing Action Sets
Recent Advances in Reinforcement Learning

Gaussian process dynamic programming
Neurocomputing

Learning complex motions by sequencing simpler motion templates
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Reinforcement learning for robot soccer
Autonomous Robots

Sample-efficient evolutionary function approximation for reinforcement learning
AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1

Compositional Models for Reinforcement Learning
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I

Reinforcement learning versus model predictive control: a comparison on a power system problem
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Adaptive autonomous control using online value iteration with Gaussian processes
ICRA'09 Proceedings of the 2009 IEEE international conference on Robotics and Automation

Using continuous action spaces to solve discrete problems
IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks

Neural dynamic programming based temperature optimal control for cement calcined process
CCDC'09 Proceedings of the 21st annual international conference on Chinese control and decision conference

Improving optimality of neural rewards regression for data-efficient batch near-optimal policy identification
ICANN'07 Proceedings of the 17th international conference on Artificial neural networks

Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning
Autonomous Agents and Multi-Agent Systems

Reinforcement learning based neural controllers for dynamic processes without exploration
ICANN'10 Proceedings of the 20th international conference on Artificial neural networks: Part II

Hessian matrix distribution for Bayesian policy gradient reinforcement learning
Information Sciences: an International Journal

Sequential feature selection for classification
AI'11 Proceedings of the 24th international conference on Advances in Artificial Intelligence

Reinforcement learning with a bilinear Q function
EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning

Learn to swing up and balance a real pole based on raw visual input data
ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part V

Machine learning for interactive systems and robots: a brief introduction
Proceedings of the 2nd Workshop on Machine Learning for Interactive Systems: Bridging the Gap Between Perception, Action and Communication

A brain-computer interface for high-level remote control of an autonomous, reinforcement-learning-based robotic system for reaching and grasping
Proceedings of the 19th international conference on Intelligent User Interfaces

Abstract

This paper introduces NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron. Based on the principle of storing and reusing transition experiences, a model-free, neural-network-based Reinforcement Learning algorithm is proposed. The method is evaluated on three benchmark problems. It is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.
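
The idea in the abstract, storing transition experiences and reusing them to train a multi-layer perceptron that represents the Q-value function, can be illustrated with a minimal Python sketch. This is not the paper's implementation. Everything concrete in it is an assumption made for illustration: the five-state chain task and its step cost of 0.1, the cost-minimising target, the network size, the discount factor, and the use of plain full-batch gradient descent as the optimizer. All function names (`init_mlp`, `fit`, `nfq`, `chain_transitions`) are hypothetical.

```python
import numpy as np

def init_mlp(n_in, n_hidden, rng):
    """One-hidden-layer MLP (tanh hidden units, linear output)."""
    return {"W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0.0, 0.5, (n_hidden, 1)), "b2": np.zeros(1)}

def predict(p, x):
    """Q-value estimate for each input row (state, action)."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    return (h @ p["W2"] + p["b2"]).ravel()

def fit(p, x, y, epochs=200, lr=0.05):
    """Supervised step: full-batch gradient descent on mean squared error.
    The optimizer is an assumption of this sketch, chosen for brevity."""
    n = len(y)
    for _ in range(epochs):
        h = np.tanh(x @ p["W1"] + p["b1"])
        err = (h @ p["W2"] + p["b2"]).ravel() - y
        g_out = (2.0 / n) * err[:, None]             # gradient w.r.t. network output
        g_h = (g_out @ p["W2"].T) * (1.0 - h ** 2)   # back-propagated through tanh
        p["W2"] -= lr * (h.T @ g_out)
        p["b2"] -= lr * g_out.sum(axis=0)
        p["W1"] -= lr * (x.T @ g_h)
        p["b1"] -= lr * g_h.sum(axis=0)
    return p

def q_input(states, actions):
    """One network input row per sample: (state, action)."""
    return np.column_stack([states, actions])

def nfq(transitions, actions, gamma=0.9, iterations=30, seed=0):
    """Fitted Q iteration over a fixed set of stored transitions.
    Each transition is (state, action, cost, next_state, done)."""
    rng = np.random.default_rng(seed)
    s, a, c, s2, done = (np.array(col, dtype=float) for col in zip(*transitions))
    params = init_mlp(2, 8, rng)
    x = q_input(s, a)
    for _ in range(iterations):
        # Q-values of every action in each successor state, shape (n, len(actions))
        q_next = np.stack(
            [predict(params, q_input(s2, np.full_like(s2, b))) for b in actions], axis=1)
        # Cost formulation: bootstrap from the cheapest successor action
        targets = c + gamma * (1.0 - done) * q_next.min(axis=1)
        params = fit(params, x, targets)             # reuse the whole stored batch
    return params

def chain_transitions(n_states=5, step_cost=0.1):
    """Hypothetical toy task: walk along a chain; the right end is the goal."""
    data = []
    for i in range(n_states - 1):
        for a in (-1.0, 1.0):
            j = int(min(max(i + a, 0), n_states - 1))
            data.append((i / (n_states - 1), a, step_cost,
                         j / (n_states - 1), j == n_states - 1))
    return data

if __name__ == "__main__":
    actions = (-1.0, 1.0)
    params = nfq(chain_transitions(), actions)
    for i in range(4):
        state = np.array([i / 4])
        q = [predict(params, q_input(state, np.array([b])))[0] for b in actions]
        print(f"state {i}: greedy action {actions[int(np.argmin(q))]:+.0f}")
```

The point of the sketch is the split between the outer loop, which recomputes regression targets from the stored transitions, and the inner supervised fit over the entire batch. No new interaction with the system is needed between iterations, which is where the data efficiency mentioned in the abstract would come from. A reward-maximising variant would replace `min` with `max` and `argmin` with `argmax`.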