Cooperating with a Markovian Ad Hoc Teammate

  • Authors:
  • Doran Chakraborty; Peter Stone

  • Affiliations:
  • Microsoft, Sunnyvale, CA, USA; The University of Texas at Austin, Austin, TX, USA

  • Venue:
  • Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems
  • Year:
  • 2013


Abstract

This paper focuses on learning in the presence of a Markovian teammate in ad hoc teams. A Markovian teammate's policy is a function of a set of discrete feature values derived from the joint history of interaction, where the feature values transition in a Markovian fashion on each time step. We introduce a novel algorithm, "Learning to Cooperate with a Markovian teammate" (LCM), that converges to optimal cooperation with any Markovian teammate and achieves safety with any arbitrary teammate. The novel aspect of LCM is the manner in which it satisfies these two goals via efficient exploration and exploitation. The main contribution of this paper is a full specification of LCM and a detailed analysis of its theoretical properties.
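The notion of a Markovian teammate described above can be illustrated with a minimal sketch. This is not the paper's formalism or the LCM algorithm itself; the class name, the dictionary-based policy and transition function, and the toy feature/action sets are all hypothetical, chosen only to show a policy that depends solely on a discrete feature value whose transitions obey the Markov property.

```python
# Illustrative sketch (not from the paper): a teammate whose policy
# depends only on a discrete feature value, and whose feature value
# transitions in a Markovian fashion on each time step.
class MarkovianTeammate:
    def __init__(self, policy, transition, init_feature):
        # policy: maps a feature value to the teammate's action.
        # transition: maps (feature, joint_action) to the next feature.
        # These names and shapes are assumptions for illustration.
        self.policy = policy
        self.transition = transition
        self.feature = init_feature

    def act(self):
        # The teammate's action is a function of the current feature alone.
        return self.policy[self.feature]

    def update(self, joint_action):
        # The next feature depends only on the current feature and the
        # latest joint action -- the Markov property over feature values.
        self.feature = self.transition[(self.feature, joint_action)]

# Toy instance: two feature values {0, 1}, two actions {"A", "B"}.
policy = {0: "A", 1: "B"}
transition = {
    (0, ("A", "A")): 1, (0, ("A", "B")): 0,
    (0, ("B", "A")): 0, (0, ("B", "B")): 1,
    (1, ("A", "A")): 0, (1, ("A", "B")): 1,
    (1, ("B", "A")): 1, (1, ("B", "B")): 0,
}
mate = MarkovianTeammate(policy, transition, init_feature=0)
a = mate.act()         # teammate's action given the current feature
mate.update((a, "A"))  # feature transitions on the observed joint action
```

An ad hoc agent such as LCM would interact with a teammate of this form without knowing `policy` or `transition` in advance, learning them from the joint history of interaction.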