Comparative evaluation of MAL algorithms in a diverse set of ad hoc team problems

Authors:
Stefano V. Albrecht;Subramanian Ramamoorthy
Affiliations:
University of Edinburgh, Edinburgh, UK;University of Edinburgh, Edinburgh, UK
Venue:
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Year:
2012

Abstract

This paper evaluates different multiagent learning (MAL) algorithms in problems where individual agents may be heterogeneous, in the sense of using different learning strategies, without the opportunity for prior agreements or information regarding coordination. Such a situation arises in ad hoc team problems, a model of many practical multiagent systems applications. Prior work in multiagent learning has often focussed on homogeneous groups of agents, meaning that all agents were identical and aware of this fact a priori. Also, those algorithms that are specifically designed for ad hoc team problems are typically evaluated in teams of agents with fixed behaviours, as opposed to agents that adapt their behaviours. In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems. All teams consist of agents that are continuously adapting their behaviours. The algorithms are evaluated with respect to a comprehensive characterisation of repeated matrix games, using performance criteria that include attainment of equilibrium, social welfare and fairness. Our main conclusion is that there is no clear winner. However, the comparative evaluation also highlights the relative strengths of different algorithms with respect to the type of performance criterion, e.g., social welfare vs. attainment of equilibrium.