Learning to act in a multiagent environment is a challenging problem. Optimal behavior for one agent depends upon the behavior of the other agents, which are learning as well. Multiagent environments are therefore non-stationary, violating the traditional stationarity assumption underlying single-agent learning. In addition, agents in complex tasks may have limitations, such as physical constraints or designer-imposed approximations of the task that make learning tractable. Limitations prevent agents from acting optimally, which complicates an already challenging problem: a learning agent must effectively compensate for its own limitations while exploiting the limitations of the other agents. My thesis research focuses on these two challenges, namely multiagent learning and limitations, and includes four main contributions. First, the thesis introduces the novel concepts of a variable learning rate and the WoLF (Win or Learn Fast) principle to account for other learning agents. The WoLF principle can make rational learning algorithms converge to optimal policies, thereby achieving two properties, rationality and convergence, that had not been achieved by previous techniques. The converging effect of WoLF is proven for a class of matrix games and demonstrated empirically for a wide range of stochastic games. Second, the thesis contributes an analysis of the effect of limitations on the game-theoretic concept of Nash equilibria. The existence of equilibria is important if multiagent learning techniques, which often depend on the concept, are to be applied to realistic problems where limitations are unavoidable. The thesis introduces a general model for the effect of limitations on agent behavior, which is used to analyze the resulting impact on equilibria. The thesis shows that equilibria do exist for a few restricted classes of games and limitations, but that, in general, even well-behaved limitations do not preserve the existence of equilibria.
Third, the thesis introduces GraWoLF, a general-purpose, scalable, multiagent learning algorithm. GraWoLF combines policy gradient learning techniques with the WoLF variable learning rate. The effectiveness of the learning algorithm is demonstrated in both a card game with an intractably large state space, and an adversarial robot task. These two tasks are complex and agent limitations are prevalent in both. Fourth, the thesis describes the CMDragons robot soccer team strategy for adapting to an unknown opponent. (Abstract shortened by UMI.)
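The combination GraWoLF makes, a policy-gradient update whose step size is modulated by the WoLF win/lose test, can be sketched as a single parameter update. Everything below is an illustrative assumption (function name, signature, step sizes); only the structure follows the abstract: learn slowly when the current parameters outperform a running average of past parameters, quickly otherwise.

```python
def grawolf_update(theta, avg_theta, grad, value, t,
                   eta_win=0.01, eta_lose=0.04):
    """One gradient-ascent step in the spirit of GraWoLF (illustrative only).

    theta     -- current policy parameters (list of floats)
    avg_theta -- running average of past parameters
    grad      -- estimated policy gradient at theta
    value     -- callable estimating expected reward of a parameter vector
    t         -- update count (for the running average)
    """
    # WoLF test on parameters: "winning" means the current parameters
    # outperform the historical average, so take the smaller step.
    eta = eta_win if value(theta) > value(avg_theta) else eta_lose

    # Policy-gradient ascent with the WoLF-chosen step size.
    new_theta = [th + eta * g for th, g in zip(theta, grad)]

    # Update the running average of parameters.
    new_avg = [av + (th - av) / (t + 1)
               for av, th in zip(avg_theta, new_theta)]
    return new_theta, new_avg
```

In the thesis the gradient is estimated from experience in the card game and robot tasks; here any gradient estimate can be plugged in.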