Model-based and model-free reinforcement learning for visual servoing

Authors:
Amir Massoud Farahmand;Azad Shademan;Martin Jägersand;Csaba Szepesvári
Affiliations:
Department of Computing Science, University of Alberta, Canada;Department of Computing Science, University of Alberta, Canada;Department of Computing Science, University of Alberta, Canada;Department of Computing Science, University of Alberta, Canada
Venue:
ICRA'09: Proceedings of the 2009 IEEE International Conference on Robotics and Automation
Year:
2009

Abstract

To address the difficulty of designing a controller for complex visual-servoing tasks, two learning-based uncalibrated approaches are introduced. The first method builds an estimated model of the visual-motor forward kinematics of the vision-robot system using locally linear regression. It then applies a reinforcement learning method, Regularized Fitted Q-Iteration, to find a controller (i.e., a policy) for the system (model-based RL). The second method uses samples collected from the robot directly, without building any intermediate model (model-free RL). Simulation results show that both methods perform comparably well despite having no a priori knowledge about the robot.
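
As a rough illustration of the model-building step in the first method, the following Python sketch fits a locally weighted linear model to (joint configuration, image feature) samples and reads off a local estimate of the visual-motor Jacobian. It is not the authors' implementation. The Gaussian kernel, the bandwidth, the small ridge term, the function name local_linear_model, and the toy two-link arm standing in for the vision-robot system are all assumptions made for this example.

```python
import numpy as np


def local_linear_model(Q, S, q0, bandwidth=0.5, ridge=1e-8):
    """Locally weighted linear regression around the query point q0.

    Q : (n, d) array of sampled joint configurations.
    S : (n, m) array of image features observed at those configurations.
    Returns (s0, J): the predicted feature at q0, shape (m,), and the
    estimated local Jacobian ds/dq at q0, shape (m, d).
    The kernel, bandwidth, and ridge term are illustrative assumptions.
    """
    diffs = Q - q0                                   # centre inputs on q0
    sq_dist = np.sum(diffs ** 2, axis=1)
    w = np.exp(-0.5 * sq_dist / bandwidth ** 2)      # Gaussian kernel weights
    X = np.hstack([np.ones((Q.shape[0], 1)), diffs]) # intercept + linear terms
    XtW = X.T * w                                    # weight each sample
    A = XtW @ X + ridge * np.eye(X.shape[1])         # small ridge for stability
    beta = np.linalg.solve(A, XtW @ S)               # (d + 1, m) coefficients
    s0 = beta[0]                                     # intercept = value at q0
    J = beta[1:].T                                   # slopes = local Jacobian
    return s0, J


def toy_camera(Q):
    """Hypothetical stand-in for the unknown visual-motor map:
    end-effector position of a planar two-link arm with unit links."""
    q1, q2 = Q[:, 0], Q[:, 1]
    return np.column_stack([np.cos(q1) + np.cos(q1 + q2),
                            np.sin(q1) + np.sin(q1 + q2)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q = rng.uniform(-1.0, 1.0, size=(500, 2))        # sampled joint angles
    S = toy_camera(Q)                                # observed image features
    s0, J = local_linear_model(Q, S, np.array([0.3, -0.2]), bandwidth=0.2)
    print("predicted feature:", s0)
    print("estimated Jacobian:\n", J)
```

A model of this kind only approximates the map near the query point, so the bandwidth trades bias against variance. In the model-based approach described above, such a learned model would then supply the transitions on which a policy is computed.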