Fast concurrent reinforcement learners

Authors:
Bikramjit Banerjee;Sandip Sen;Jing Peng
Affiliations:
Math & Computer Sciences Department, University of Tulsa, Tulsa, OK;Math & Computer Sciences Department, University of Tulsa, Tulsa, OK;Dept. of Computer Science, Oklahoma State University, Stillwater, OK
Venue:
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Year:
2001

Abstract

When several agents learn concurrently, the payoff each agent receives depends on the behavior of the other agents. As the other agents learn, the reward of one agent becomes non-stationary, which makes learning in multiagent systems more difficult than single-agent learning. A few methods, however, are known to guarantee convergence to equilibrium in the limit in such systems. In this paper we experimentally study one such technique, minimax-Q, in a competitive domain and prove its equivalence with another well-known method for competitive domains. We study the rate of convergence of minimax-Q and investigate possible ways of increasing it. We also present a variant of the algorithm, minimax-SARSA, and prove its convergence to minimax-Q values under appropriate conditions. Finally, we show that this new algorithm performs better than simple minimax-Q in a general-sum domain as well.
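
The abstract names the two algorithms but does not state their update rules, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the commonly cited minimax-Q form, in which the target bootstraps from the minimax value of the next state's Q-matrix, and assumes that minimax-SARSA instead bootstraps from the Q-value of the joint action actually taken in the next state. The two-action game, the function names, and the default step size and discount are hypothetical choices made for illustration; the closed-form value applies only to 2x2 zero-sum matrices.

```python
from collections import defaultdict

ACTIONS = (0, 1)  # hypothetical: two actions for each player


def new_q_table():
    """Q[s][a][o]: value for the learner playing a while the opponent plays o."""
    return defaultdict(lambda: [[0.0, 0.0], [0.0, 0.0]])


def minimax_value_2x2(q):
    """Value of the 2x2 zero-sum matrix game q[a][o] for the maximizing row player."""
    maximin = max(min(row) for row in q)
    minimax = min(max(q[a][o] for a in ACTIONS) for o in ACTIONS)
    if maximin == minimax:
        # A saddle point exists, so a pure strategy is optimal.
        return maximin
    (a, b), (c, d) = q
    # No saddle point: closed-form value of the mixed-strategy solution.
    return (a * d - b * c) / (a + d - b - c)


def minimax_q_update(Q, s, a, o, r, s_next, alpha=0.1, gamma=0.9):
    """Assumed minimax-Q step: bootstrap from the minimax value of s_next."""
    target = r + gamma * minimax_value_2x2(Q[s_next])
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * target


def minimax_sarsa_update(Q, s, a, o, r, s_next, a_next, o_next,
                         alpha=0.1, gamma=0.9):
    """Assumed minimax-SARSA step: bootstrap from the next joint action taken."""
    target = r + gamma * Q[s_next][a_next][o_next]
    Q[s][a][o] = (1 - alpha) * Q[s][a][o] + alpha * target


if __name__ == "__main__":
    Q = new_q_table()
    Q["s1"] = [[1.0, -1.0], [-1.0, 1.0]]  # matching-pennies payoffs
    minimax_q_update(Q, "s0", 0, 1, r=1.0, s_next="s1")
    minimax_sarsa_update(Q, "s0", 0, 0, r=1.0, s_next="s1", a_next=0, o_next=0)
    print(Q["s0"])
```

In this sketch the only difference between the two steps is the bootstrap term: the first solves a matrix game at every update, while the second reads a single table entry. A game with more than two actions per player would need a linear-programming solver in place of the closed-form value.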