Despite the increasing deployment of agent technologies across business and industry domains, user confidence in fully automated, agent-driven applications remains noticeably low. The main reasons for this lack of trust in complete automation are poor scalability and the absence of reasonable performance guarantees for self-adapting software. In this paper we address the latter issue in the context of learning agents in a Multiagent System (MAS). Performance guarantees for most existing online Multiagent Learning (MAL) algorithms are realizable only in the limit, which seriously limits their practical utility. Our goal is to provide meaningful guarantees about the performance of a learner in a MAS while it is learning. In particular, we present a novel MAL algorithm that (i) converges to a best response against stationary opponents, (ii) converges to a Nash equilibrium in self-play, and (iii) achieves a constant bound on expected regret at any time (hence no average regret asymptotically) in general-sum games of arbitrary size with non-negative payoffs, against any number of opponents.
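The no-regret property in (iii) can be made concrete with a small sketch using the standard external-regret definition: after T rounds, regret is the gap between the cumulative payoff of the best fixed action in hindsight and the learner's realized cumulative payoff. The payoff matrix and action sequences below are illustrative assumptions, not taken from the paper's algorithm.

```python
# Sketch: external regret of a learner in a repeated matrix game.
# The game and play sequences are hypothetical examples; the paper's
# algorithm bounds this quantity by a constant at any time.

def external_regret(payoff, my_actions, opp_actions):
    """Best fixed row's hindsight payoff minus the learner's realized payoff."""
    realized = sum(payoff[a][b] for a, b in zip(my_actions, opp_actions))
    best_fixed = max(
        sum(payoff[a][b] for b in opp_actions) for a in range(len(payoff))
    )
    return best_fixed - realized

# 2x2 game with non-negative payoffs (as the paper assumes).
payoff = [[3, 0],
          [1, 2]]

# Against a stationary opponent always playing column 0, alternating rows
# leaves regret relative to always playing row 0.
print(external_regret(payoff, [1, 0, 1, 0], [0, 0, 0, 0]))  # → 4
```

A no-average-regret learner drives `external_regret(...) / T` to zero as the number of rounds T grows; the constant bound claimed in (iii) is stronger, holding at every finite time.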