This paper deals with the problem of learning to control Multi-Component Robotic Systems (MCRSs) by applying Multi-Agent Reinforcement Learning (MARL) algorithms. Modeling Linked MCRSs usually leads to over-constrained environments, which pose great difficulties for efficient learning with conventional single-agent and multi-agent reinforcement learning algorithms. In this paper, we propose a hybrid learning algorithm: a modified Q-Learning algorithm embedding an Undesired State-Action Prediction (USAP) module, trained by supervised learning, that predicts transitions into states that break the system's physical constraints. The Q-Learning algorithm uses the USAP module's output to avoid these undesired transitions, thereby boosting learning efficiency. This hybrid approach is extended to the multi-agent case by embedding the USAP module in Distributed Round-Robin Q-Learning (D-RR-QL), which requires very little communication among agents. We present results of computational experiments conducted on the classical multi-agent taxi scheduling task and on a hose transportation task. Results show a considerable learning gain in both time and accuracy compared with the state-of-the-art Distributed Q-Learning approach on the deterministic taxi scheduling task. On the hose transportation task, the USAP module yields a significant improvement in learning convergence speed.
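The core idea described above — a supervised module that flags constraint-breaking state-action pairs, whose predictions are then used to mask actions during Q-Learning — can be sketched roughly as follows. This is a minimal single-agent illustration, not the paper's algorithm: the toy 1-D chain environment, the table-based USAP stand-in for the supervised classifier, and all parameter values are hypothetical choices made for the example.

```python
import random
from collections import defaultdict

# Toy 1-D chain MDP: states 0..4, actions -1/+1. Stepping outside
# [0, 4] is treated as an "undesired" constraint-breaking transition
# (a hypothetical stand-in for the physical hose constraints).
N_STATES, ACTIONS, GOAL = 5, (-1, +1), 4

def step(s, a):
    s2 = s + a
    if s2 < 0 or s2 >= N_STATES:          # constraint broken
        return s, -10.0, True
    return s2, (1.0 if s2 == GOAL else -0.1), False

def collect_labels(n=200):
    """Gather labeled transitions for supervised USAP training."""
    data = []
    for _ in range(n):
        s, a = random.randrange(N_STATES), random.choice(ACTIONS)
        _, _, undesired = step(s, a)
        data.append((s, a, undesired))
    return data

def train_usap(samples):
    """Fit the USAP predictor; a lookup table replaces a classifier here."""
    usap = defaultdict(bool)
    for s, a, undesired in samples:
        usap[(s, a)] = usap[(s, a)] or undesired
    return usap

def q_learning(usap, episodes=300, alpha=0.5, gamma=0.95, eps=0.1):
    Q = defaultdict(float)
    for _ in range(episodes):
        s = 0
        for _ in range(50):
            # Mask out actions the USAP module predicts as undesired.
            allowed = [a for a in ACTIONS if not usap[(s, a)]] or list(ACTIONS)
            if random.random() < eps:
                a = random.choice(allowed)
            else:
                a = max(allowed, key=lambda a: Q[(s, a)])
            s2, r, undesired = step(s, a)
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS)
                                  - Q[(s, a)])
            s = s2
            if s == GOAL or undesired:
                break
    return Q

random.seed(0)
usap = train_usap(collect_labels())
Q = q_learning(usap)
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
```

The mask keeps the agent from repeatedly sampling transitions it can already predict to be constraint-breaking, which is the source of the convergence-speed gain the paper reports; the multi-agent D-RR-QL extension applies the same masking per agent in its round-robin updates.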