Despite the increasing deployment of agent technologies across business and industry domains, user confidence in fully automated, agent-driven applications remains noticeably low. The main reasons for this lack of trust in complete automation are poor scalability and the absence of reasonable performance guarantees for self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing online Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant bound on expected regret at any time (hence no average regret asymptotically) in general-sum games of arbitrary size with non-negative payoffs, against any number of opponents.
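The no-regret property in (iii) can be made concrete with a small sketch using the standard external-regret definition: after T rounds, regret is the gap between the cumulative payoff of the best fixed action in hindsight and the learner's realized cumulative payoff. The payoff matrix and action sequences below are illustrative assumptions, not taken from the paper's algorithm.

```python
# Sketch: external regret of a learner in a repeated matrix game.
# The game and play sequences are hypothetical examples; the paper's
# algorithm bounds this quantity by a constant at any time.

def external_regret(payoff, my_actions, opp_actions):
    """Best fixed row's hindsight payoff minus the learner's realized payoff."""
    realized = sum(payoff[a][b] for a, b in zip(my_actions, opp_actions))
    best_fixed = max(
        sum(payoff[a][b] for b in opp_actions) for a in range(len(payoff))
    )
    return best_fixed - realized

# 2x2 game with non-negative payoffs (as the paper assumes).
payoff = [[3, 0],
          [1, 2]]

# Against a stationary opponent always playing column 0, alternating rows
# leaves regret relative to always playing row 0.
print(external_regret(payoff, [1, 0, 1, 0], [0, 0, 0, 0]))  # → 4
```

A no-average-regret learner drives `external_regret(...) / T` to zero as the number of rounds T grows; the constant bound claimed in (iii) is stronger, holding at every finite time.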