Learning to act in a multiagent environment is a challenging problem. Optimal behavior for one agent depends upon the behavior of the other agents, which are learning as well. Multiagent environments are therefore non-stationary, violating the traditional stationarity assumption underlying single-agent learning. In addition, agents in complex tasks may have limitations, such as physical constraints or designer-imposed approximations of the task that make learning tractable. Limitations prevent agents from acting optimally, which complicates an already challenging problem: a learning agent must effectively compensate for its own limitations while exploiting the limitations of the other agents. My thesis research focuses on these two challenges, namely multiagent learning and limitations, and includes four main contributions. First, the thesis introduces the novel concepts of a variable learning rate and the WoLF (Win or Learn Fast) principle to account for other learning agents. The WoLF principle can make rational learning algorithms converge to optimal policies, thereby achieving two properties, rationality and convergence, that had not been achieved by previous techniques. The converging effect of WoLF is proven for a class of matrix games and demonstrated empirically for a wide range of stochastic games. Second, the thesis contributes an analysis of the effect of limitations on the game-theoretic concept of Nash equilibria. The existence of equilibria is important if multiagent learning techniques, which often depend on the concept, are to be applied to realistic problems where limitations are unavoidable. The thesis introduces a general model for the effect of limitations on agent behavior, which is used to analyze the resulting impact on equilibria. The thesis shows that equilibria do exist for a few restricted classes of games and limitations, but that, in general, even well-behaved limitations do not preserve the existence of equilibria.
Third, the thesis introduces GraWoLF, a general-purpose, scalable, multiagent learning algorithm. GraWoLF combines policy gradient learning techniques with the WoLF variable learning rate. The effectiveness of the learning algorithm is demonstrated in both a card game with an intractably large state space, and an adversarial robot task. These two tasks are complex and agent limitations are prevalent in both. Fourth, the thesis describes the CMDragons robot soccer team strategy for adapting to an unknown opponent. (Abstract shortened by UMI.)
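The combination GraWoLF makes, a policy-gradient update whose step size is modulated by the WoLF win/lose test, can be sketched as a single parameter update. Everything below is an illustrative assumption (function name, signature, step sizes); only the structure follows the abstract: learn slowly when the current parameters outperform a running average of past parameters, quickly otherwise.

```python
def grawolf_update(theta, avg_theta, grad, value, t,
                   eta_win=0.01, eta_lose=0.04):
    """One gradient-ascent step in the spirit of GraWoLF (illustrative only).

    theta     -- current policy parameters (list of floats)
    avg_theta -- running average of past parameters
    grad      -- estimated policy gradient at theta
    value     -- callable estimating expected reward of a parameter vector
    t         -- update count (for the running average)
    """
    # WoLF test on parameters: "winning" means the current parameters
    # outperform the historical average, so take the smaller step.
    eta = eta_win if value(theta) > value(avg_theta) else eta_lose

    # Policy-gradient ascent with the WoLF-chosen step size.
    new_theta = [th + eta * g for th, g in zip(theta, grad)]

    # Update the running average of parameters.
    new_avg = [av + (th - av) / (t + 1)
               for av, th in zip(avg_theta, new_theta)]
    return new_theta, new_avg
```

In the thesis the gradient is estimated from experience in the card game and robot tasks; here any gradient estimate can be plugged in.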