Module-Based Reinforcement Learning: Experiments with a Real Robot

  • Authors:
  • Zsolt Kalmár; Csaba Szepesvári; András Lőrincz

  • Affiliations:
  • Department of Informatics, “József Attila” University of Szeged, Szeged, Aradi vrt. tere 1, Hungary H-6720. E-mail: kalmar@mindmaker.kfkipark.hu
  • Research Group on Artificial Intelligence, “József Attila” University of Szeged, Szeged, Aradi vrt. tere 1, Hungary H-6720. E-mail: szepes@mindmaker.kfkipark.hu
  • Department of Adaptive Systems, “József Attila” University of Szeged, Szeged, Aradi vrt. tere 1, Hungary H-6720. E-mail: lorincz@mindmaker.kfkipark.hu

  • Venue:
  • Autonomous Robots
  • Year:
  • 1998


Abstract

The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous both in time and space, and moreover are partially observable. Here we suggest a systematic approach to solving such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: (i) decompose the task into subtasks using the qualitative knowledge at hand; (ii) design local controllers to solve the subtasks using the available quantitative knowledge; and (iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high-quality robot control, as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA, and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment that were not foreseen in advance, supporting the view that adaptive algorithms are advantageous over nonadaptive ones in complex environments.
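The three-step design process in the abstract can be sketched in code. The fragment below illustrates step (iii) only: tabular Q-learning of a switching strategy over hand-designed local controllers. The corridor world, the controller names (`step_ctrl`, `dash_ctrl`), and the reward values are hypothetical stand-ins for the real robot task and do not come from the paper; the point is merely that the learner picks which pre-built module to activate in each state, rather than learning low-level control.

```python
import random

# Toy world (illustrative, not from the paper): a 1-D corridor with an
# obstacle. Two hand-designed local controllers ("modules") are given;
# Q-learning only learns WHICH controller to activate in each state.
GOAL, OBSTACLE, LENGTH = 9, 5, 10

def step_ctrl(pos):
    """Careful module: one cell forward; can pass the obstacle safely."""
    return min(pos + 1, GOAL), False

def dash_ctrl(pos):
    """Fast module: three cells forward; crashes if it crosses the obstacle."""
    nxt = pos + 3
    if pos < OBSTACLE <= nxt:
        return 0, True              # collision: thrown back to the start
    return min(nxt, GOAL), False

CONTROLLERS = [step_ctrl, dash_ctrl]

def train(episodes=3000, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Learn a Q-table over (state, module) pairs with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(LENGTH)]
    for _ in range(episodes):
        pos = 0
        for _ in range(50):         # cap episode length
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda i: q[pos][i])
            nxt, crash = CONTROLLERS[a](pos)
            r = -5.0 if crash else (10.0 if nxt == GOAL else -1.0)
            target = r + (0.0 if nxt == GOAL else gamma * max(q[nxt]))
            q[pos][a] += alpha * (target - q[pos][a])
            pos = nxt
            if pos == GOAL:
                break
    return q

def greedy_rollout(q):
    """Follow the learnt switching strategy; return final state and module choices."""
    pos, choices = 0, []
    while pos != GOAL and len(choices) < 20:
        a = max((0, 1), key=lambda i: q[pos][i])
        choices.append(CONTROLLERS[a].__name__)
        pos, _ = CONTROLLERS[a](pos)
    return pos, choices

if __name__ == "__main__":
    q = train()
    final, choices = greedy_rollout(q)
    print(final, choices)
```

The learnt strategy dashes where it is safe and falls back to the careful module near the obstacle, mirroring the paper's claim that coordination of unfine-tuned local controllers can be learnt rather than handcrafted.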