This paper describes the Semi-Online Neural-Q-learning (SONQL) algorithm, designed for real-time learning of reactive robot behaviors. The Q-function is generalized by a multilayer neural network, allowing the use of continuous states. The algorithm maintains a database of the most recent learning samples to accelerate and improve convergence. Each SONQL instance represents an independent, reactive, and adaptive state-action mapping that implements a robot behavior for one degree of freedom (DOF). The generalization capability of the SONQL algorithm was demonstrated in computer simulation on the "mountain-car" benchmark. SONQL was also evaluated experimentally on a mobile robot in a target-following task. The results show that SONQL is promising for online robot learning.
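The two ingredients the abstract names, a multilayer network approximating the Q-function over continuous states and a bounded database of the most recent learning samples replayed during updates, can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation; the class name, network size, and all hyperparameters are hypothetical choices made for the example.

```python
import random
import numpy as np

class SONQLSketch:
    """Hedged sketch of Semi-Online Neural-Q-learning: a two-layer
    network approximates Q(s, a) over continuous states, and a bounded
    database of the most recent (s, a, r, s', done) samples is replayed
    to speed convergence. Names and sizes are illustrative only."""

    def __init__(self, n_state, n_actions, n_hidden=16,
                 db_size=100, lr=0.05, gamma=0.95, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_state))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_actions, n_hidden))
        self.b2 = np.zeros(n_actions)
        self.db = []              # database of recent learning samples
        self.db_size = db_size
        self.lr, self.gamma = lr, gamma

    def q_values(self, s):
        # Forward pass: tanh hidden layer, linear output, one Q per action.
        h = np.tanh(self.W1 @ s + self.b1)
        return self.W2 @ h + self.b2, h

    def act(self, s, epsilon=0.1):
        # Epsilon-greedy action selection over the network's Q estimates.
        if random.random() < epsilon:
            return random.randrange(self.W2.shape[0])
        q, _ = self.q_values(s)
        return int(np.argmax(q))

    def store(self, sample):
        # Keep only the most recent db_size samples.
        self.db.append(sample)
        if len(self.db) > self.db_size:
            self.db.pop(0)

    def learn(self, batch=8):
        # Replay a small batch from the sample database (the
        # "semi-online" part: old samples are reused at every step).
        for s, a, r, s2, done in random.sample(self.db,
                                               min(batch, len(self.db))):
            q, h = self.q_values(s)
            target = r if done else r + self.gamma * max(self.q_values(s2)[0])
            err = target - q[a]                    # TD error, taken action only
            grad_h = err * self.W2[a] * (1 - h**2)  # backprop through tanh
            self.W2[a] += self.lr * err * h
            self.b2[a] += self.lr * err
            self.W1 += self.lr * np.outer(grad_h, s)
            self.b1 += self.lr * grad_h
```

For a target-following behavior on one DOF, the state could be the continuous target offset and the actions discrete motor commands; each additional DOF would get its own independent SONQL instance, matching the one-behavior-per-DOF decomposition described above.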