MB-AIM-FSI: a model based framework for exploiting gradient ascent multiagent learners in strategic interactions

Authors:
Doran Chakraborty;Sandip Sen
Affiliations:
University of Texas, Austin, Austin, Texas;University of Tulsa, Tulsa, Oklahoma
Venue:
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Year:
2008

Citing 7
Cited 1

Technical Note: Q-Learning
Machine Learning

Machine Learning
Machine Learning

Introduction to Reinforcement Learning
Introduction to Reinforcement Learning

Convergence of Gradient Dynamics with a Variable Learning Rate
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning

RVσ(t): a unifying approach to performance and convergence in online multiagent learning
AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems

Rational and convergent learning in stochastic games
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2

Learning against opponents with bounded memory
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Effective learning in the presence of adaptive counterparts
Journal of Algorithms

Abstract

Future agent applications will increasingly represent human users autonomously or semi-autonomously in strategic interactions with similar entities. Hence, there is a growing need for algorithmic approaches that can learn to recognize commonalities in opponent strategies and exploit them to improve the strategic response. Recently, a framework [9] has been proposed that aims for targeted optimality against a set of finite-memory opponents. We propose an approach that aims for targeted optimality against the set of all multiagent learning algorithms that perform gradient search to select a single-stage Nash equilibrium of a repeated game. Such opponents induce a Markov Decision Process as the learning environment, and appropriate responses to such environments can be learned by assuming a generative model of the environment. For the case where no generative model is available, we present a framework, MB-AIM-FSI, that models the opponent online from interactions, solves the model off-line once sufficient information has been gathered, stores the resulting strategy in a repository, and later uses it judiciously when playing against the same or a similar opponent.
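The model-then-solve loop outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `EmpiricalMDP`, the methods `observe` and `solve`, the `repository` dictionary, the toy states and actions, and the choice of tabular value iteration as the off-line solver are all assumptions made here for illustration. The abstract does not say how states are represented or which solver MB-AIM-FSI uses.

```python
from collections import defaultdict


class EmpiricalMDP:
    """Hypothetical tabular model built from observed (s, a, r, s') tuples."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': count}
        self.reward_sum = defaultdict(float)                  # (s, a) -> summed reward
        self.visits = defaultdict(int)                        # (s, a) -> visit count

    def observe(self, s, a, r, s_next):
        """Online phase: record one interaction with the opponent."""
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def solve(self, gamma=0.9, sweeps=200):
        """Off-line phase: value iteration on the estimated model.

        Returns a greedy policy {state: action} over the states that were
        observed with at least one action.
        """
        actions = defaultdict(list)
        for (s, a) in self.visits:
            actions[s].append(a)
        states = set(actions) | {s2 for d in self.counts.values() for s2 in d}
        values = {s: 0.0 for s in states}

        def q(s, a):
            n = self.visits[(s, a)]
            mean_reward = self.reward_sum[(s, a)] / n
            expected_next = sum(
                c / n * values[s2] for s2, c in self.counts[(s, a)].items()
            )
            return mean_reward + gamma * expected_next

        for _ in range(sweeps):
            for s in actions:
                values[s] = max(q(s, a) for a in actions[s])
        return {s: max(actions[s], key=lambda a: q(s, a)) for s in actions}


# Toy usage with made-up states, actions, and rewards (assumptions, not paper data).
model = EmpiricalMDP()
model.observe("lo", "take", 0.2, "lo")   # small immediate payoff, state unchanged
model.observe("lo", "push", 0.0, "hi")   # no payoff now, moves the opponent to "hi"
model.observe("hi", "take", 1.0, "lo")   # large payoff, opponent falls back to "lo"

policy = model.solve()

# Store the solved strategy under a hypothetical opponent label for later reuse.
repository = {"opponent-A": policy}
```

In this toy model the solved policy gives up the small immediate payoff in state `"lo"` in order to reach the more profitable state `"hi"`. That is the kind of long-horizon exploitation an MDP view of the opponent makes possible.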