Policy gradient learning for quadruped soccer robots

Authors:
A. Cherubini;F. Giannone;L. Iocchi;D. Nardi;P. F. Palamara
Affiliations:
Dipartimento di Informatica e Sistemistica, Sapienza University of Roma, Via Ariosto 25, 00185 Roma, Italy and INRIA/IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France;Dipartimento di Informatica e Sistemistica, Sapienza University of Roma, Via Ariosto 25, 00185 Roma, Italy;Dipartimento di Informatica e Sistemistica, Sapienza University of Roma, Via Ariosto 25, 00185 Roma, Italy;Dipartimento di Informatica e Sistemistica, Sapienza University of Roma, Via Ariosto 25, 00185 Roma, Italy;Dipartimento di Informatica e Sistemistica, Sapienza University of Roma, Via Ariosto 25, 00185 Roma, Italy
Venue:
Robotics and Autonomous Systems
Year:
2010

Abstract

In real-world robotic applications, many factors, both low-level (e.g., vision, motion control, and behaviors) and high-level (e.g., plans and strategies), determine the quality of a robot's performance. Consequently, fine-tuning the parameters, both in the implementation of the basic functionalities and in the strategic decisions, is a key issue in robot software development. In recent years, machine learning techniques have been used successfully to find optimal parameters for typical robotic functionalities. However, one major drawback of learning techniques is the time they consume: in practical applications, methods designed for physical robots must be effective with small amounts of data. In this paper, we present a method for concurrently learning the best strategy and the optimal parameters using a policy gradient reinforcement learning algorithm. The results of our experimental work, in a simulated environment and on a real robot, show a very high convergence rate.
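
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of one common form of policy gradient parameter tuning: a finite-difference estimate built from randomly perturbed parameter vectors. It is not the authors' implementation. The names `estimate_gradient`, `policy_gradient_search`, and `toy_score`, the step sizes, and the quadratic objective are all assumptions made for the example; on a robot, the score would come from a measured trial (for instance, the outcome of a kick or the speed of a gait) rather than from a formula.

```python
import random


def estimate_gradient(score, theta, eps, n_policies, rng):
    """Estimate an ascent direction from randomly perturbed parameter vectors.

    Each candidate shifts every parameter by -eps, 0, or +eps. For each
    dimension, the scores are grouped by the sign of the shift applied.
    """
    dim = len(theta)
    buckets = [{-1: [], 0: [], 1: []} for _ in range(dim)]
    for _ in range(n_policies):
        signs = [rng.choice((-1, 0, 1)) for _ in range(dim)]
        candidate = [t + s * e for t, s, e in zip(theta, signs, eps)]
        value = score(candidate)  # on a robot: one evaluation trial
        for d, s in enumerate(signs):
            buckets[d][s].append(value)

    grad = []
    for d in range(dim):
        minus, zero, plus = buckets[d][-1], buckets[d][0], buckets[d][1]
        if not minus or not plus:
            grad.append(0.0)  # not enough data for this dimension
            continue
        avg_minus = sum(minus) / len(minus)
        avg_plus = sum(plus) / len(plus)
        avg_zero = sum(zero) / len(zero) if zero else float("-inf")
        # Leave the parameter alone if the unperturbed value looks best.
        if avg_zero > avg_plus and avg_zero > avg_minus:
            grad.append(0.0)
        else:
            grad.append(avg_plus - avg_minus)
    return grad


def policy_gradient_search(score, theta, eps, eta, iterations,
                           n_policies=30, seed=0):
    """Repeatedly move theta a fixed distance eta along the estimated gradient."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        grad = estimate_gradient(score, theta, eps, n_policies, rng)
        norm = sum(g * g for g in grad) ** 0.5
        if norm == 0.0:
            continue  # no preferred direction in this round
        theta = [t + eta * g / norm for t, g in zip(theta, grad)]
    return theta


def toy_score(params):
    """Hypothetical stand-in for a robot trial; the maximum is at (1.0, -2.0)."""
    x, y = params
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


start = [0.0, 0.0]
best = policy_gradient_search(toy_score, start, eps=[0.1, 0.1],
                              eta=0.1, iterations=200)
print(best, toy_score(best))
```

Each iteration in this sketch costs `n_policies` evaluations, which is why the abstract's concern about learning from small amounts of data matters: on a physical robot every evaluation is a real trial, so the number of candidates per iteration and the number of iterations would have to be kept small.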