Rewards for pairs of Q-learning agents conducive to turn-taking in medium-access games

Authors:
Peter A Raffensperger;Philip J Bones;Allan I McInnes;Russell Y Webb
Affiliations:
Department of Electrical and Computer Engineering, University of Canterbury, New Zealand;Department of Electrical and Computer Engineering, University of Canterbury, New Zealand;Department of Electrical and Computer Engineering, University of Canterbury, New Zealand;Apple Computer Inc., Cupertino, California, USA
Venue:
Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Year:
2012


Abstract

We describe a class of stateful games, which we call 'medium-access games', as a model for human and machine communication, and demonstrate how to use the Nash equilibria of those games, as played by pairs of agents with stationary policies, to predict turn-taking behaviour in Q-learning agents from the agents' reward function. We identify which fixed policies exhibit turn-taking behaviour in medium-access games and show how to compute the Nash equilibria of such games by using Markov chain methods to calculate the agents' expected rewards for different stationary policies. We present simulation results for an extensive range of reward functions for pairs of Q-learners playing medium-access games, and we use our analysis for stationary agents to develop predictors for the emergence of turn-taking. We explain how to use our predictors to design reward functions for pairs of Q-learning agents that are conducive (or prohibitive) to the emergence of turn-taking in medium-access games. We focus on designing multi-agent reinforcement learning systems that deliberately produce coordinated turn-taking, but we also intend our results to be useful for analysing emergent turn-taking behaviour. Based on our turn-taking-related results, we suggest ways to use our methodology to design rewards for quantifiable behaviours besides turn-taking.
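
The Markov chain step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not define the game, so the state encoding (the previous joint action), the reward vectors, the policy probabilities and all function names below are assumptions chosen only for illustration. Given two stationary policies, the sketch builds the induced Markov chain, solves for its stationary distribution, and returns each agent's long-run expected reward per step.

```python
import numpy as np

# ASSUMPTION: the state is the previous joint action (a1, a2),
# where 1 = transmit and 0 = wait. This encoding is hypothetical.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def transition_matrix(p1, p2):
    """Markov chain induced by two stationary policies.

    p1[s] and p2[s] are the probabilities that agent 1 and agent 2
    transmit when the previous joint action was STATES[s].
    """
    P = np.zeros((4, 4))
    for s in range(4):
        for t, (a1, a2) in enumerate(STATES):
            q1 = p1[s] if a1 else 1.0 - p1[s]
            q2 = p2[s] if a2 else 1.0 - p2[s]
            P[s, t] = q1 * q2  # agents choose independently
    return P


def stationary_distribution(P):
    """Solve mu P = mu with sum(mu) = 1 by least squares."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu


def expected_rewards(p1, p2, r1, r2):
    """Long-run average reward per step for each agent."""
    mu = stationary_distribution(transition_matrix(p1, p2))
    return float(mu @ r1), float(mu @ r2)


# ASSUMPTION: hypothetical rewards indexed like STATES:
# +1 for transmitting alone, -1 for a collision, 0 otherwise.
r1 = np.array([0.0, 0.0, 1.0, -1.0])
r2 = np.array([0.0, 1.0, 0.0, -1.0])

eps = 0.05  # small exploration keeps the chain irreducible
# Noisy turn-taking: transmit if the other agent transmitted alone last step.
turn1 = np.array([0.5, 1 - eps, eps, 0.5])
turn2 = np.array([0.5, eps, 1 - eps, 0.5])
# Baseline: both agents transmit with probability 0.5 in every state.
coin = np.full(4, 0.5)

turn_rewards = expected_rewards(turn1, turn2, r1, r2)
coin_rewards = expected_rewards(coin, coin, r1, r2)
print("turn-taking:", turn_rewards)
print("coin-flip:  ", coin_rewards)
```

Evaluating such expected rewards for every pair of candidate stationary policies yields a payoff table from which equilibria could be searched. The small `eps` is a convenience of this sketch: it guarantees a unique stationary distribution, which purely deterministic policies may not have.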