Transfer Learning for Reinforcement Learning Domains: A Survey
The Journal of Machine Learning Research
Learning to control a dynamic physical system
AAAI'87 Proceedings of the Sixth National Conference on Artificial Intelligence - Volume 2
Integrating reinforcement learning with human demonstrations of varying ability
The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Using advice to transfer knowledge acquired in one reinforcement learning task to another
ECML'05 Proceedings of the 16th European conference on Machine Learning
Reinforcement learning transfer via sparse coding
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Transfer in reinforcement learning via shared features
The Journal of Machine Learning Research
Learning potential functions and their representations for multi-task reinforcement learning
Autonomous Agents and Multi-Agent Systems
We explore the use of learning schemes in training and adapting performance on simple coordination tasks. The task is 1-D pole balancing: the problem is to move a cart along a short piece of track so as to keep a pole balanced on its end; the pole is hinged to the cart at its bottom, and the cart is moved either to the left or to the right by a force of constant magnitude. Several programs incorporating learning have already achieved this (1, 5, 8). The form of the task considered here, after (3), involves a genuinely difficult credit-assignment problem. We use a learning scheme previously developed and analysed (1, 7) to achieve performance through reinforcement, and extend it to handle changing and new requirements. For example, the length or mass of the pole can change, as can the bias of the force, its strength, and so on; and the system can be tasked to avoid certain regions altogether. In this way we explore the learning system's ability to adapt to changes and to profit from a selected training sequence, both of which are of obvious utility in practical robotics applications. The results described here were obtained using a computer simulation of the pole-balancing problem. A movie will be shown of the performance of the system under the various requirements and tasks.
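The simulated task described above can be sketched in a few lines. This is a minimal illustration of the standard cart-pole dynamics commonly used in such simulations, not the paper's own code; all parameter values (masses, pole length, force magnitude, time step) are assumptions, since the abstract does not state them.

```python
import math

# Hypothetical parameters following the commonly used cart-pole
# formulation; the paper's exact constants are not given in the abstract.
GRAVITY = 9.8           # m/s^2
CART_MASS = 1.0         # kg
POLE_MASS = 0.1         # kg
POLE_HALF_LENGTH = 0.5  # m (half the pole length)
FORCE_MAG = 10.0        # constant-magnitude push, left or right
DT = 0.02               # Euler integration step, seconds

def step(state, push_right):
    """Advance the cart-pole simulation one time step.

    state = (x, x_dot, theta, theta_dot); push_right selects the
    direction of the fixed-magnitude force applied to the cart.
    """
    x, x_dot, theta, theta_dot = state
    force = FORCE_MAG if push_right else -FORCE_MAG
    total_mass = CART_MASS + POLE_MASS
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Equations of motion for the cart-pole system.
    temp = (force + POLE_MASS * POLE_HALF_LENGTH
            * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass))
    x_acc = temp - POLE_MASS * POLE_HALF_LENGTH * theta_acc * cos_t / total_mass

    # Simple Euler update of position and velocity.
    return (x + DT * x_dot,
            x_dot + DT * x_acc,
            theta + DT * theta_dot,
            theta_dot + DT * theta_acc)
```

Varying constants such as `POLE_MASS` or `FORCE_MAG` between runs mirrors the changed-requirement experiments the abstract describes (altered pole mass or length, force bias and strength).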