Efficient no-regret multiagent learning

Authors:
Bikramjit Banerjee;Jing Peng
Affiliations:
Dept. of Electrical Engineering & Computer Science, Tulane University, New Orleans, LA;Dept. of Electrical Engineering & Computer Science, Tulane University, New Orleans, LA
Venue:
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 1
Year:
2005

References cited (10)

The weighted majority algorithm. Information and Computation.
The dynamics of reinforcement learning in cooperative multiagent systems. AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence.
Multiagent learning using a variable learning rate. Artificial Intelligence.
Introduction to Reinforcement Learning.
Friend-or-Foe Q-learning in General-Sum Games. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning.
On No-Regret Learning, Fictitious Play, and Nash Equilibrium. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning.
Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm. ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning.
Nash Convergence of Gradient Dynamics in General-Sum Games. UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence.
Online convex optimization in the bandit setting: gradient descent without a gradient. SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms.
Performance bounded reinforcement learning in strategic interactions. AAAI'04 Proceedings of the 19th national conference on Artificial intelligence.

Cited by (7)

On the performance of on-line concurrent reinforcement learners. Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems.
Multi-agent learning model with bargaining. Proceedings of the 38th conference on Winter simulation.
A general criterion and an algorithmic framework for learning in multi-agent systems. Machine Learning.
If multi-agent learning is the answer, what is the question? Artificial Intelligence.
Regret based dynamics: convergence in weakly acyclic games. Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems.
Exploiting based pre-testing in competition environment. PRIMA'06 Proceedings of the 9th Pacific Rim international conference on Agent Computing and Multi-Agent Systems.
No regret learning for sensor relocation in mobile sensor networks. ICICA'11 Proceedings of the Second international conference on Information Computing and Applications.

Abstract

We present new results on the efficiency of no-regret algorithms in the context of multiagent learning. We use a known approach to augment a large class of no-regret algorithms to allow stochastic sampling of actions and observation of the scalar reward of only the action played. We show that, with high probability and in polynomial time, the average actual payoff of the resulting learner gets (1) close to the best response against (eventually) stationary opponents, (2) close to the asymptotic optimal payoff against opponents that play a converging sequence of policies, and (3) close to at least a dynamic variant of the minimax payoff against arbitrary opponents. In addition, the polynomial bounds are shown to be significantly better than previously known bounds. Furthermore, unlike previous work, we do not need to assume that the learner knows the game matrices and can observe the opponents' actions.
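To make the setting in the abstract concrete, the following is a minimal sketch of a no-regret learner that samples its actions stochastically and observes only the scalar reward of the action it played. The abstract does not name the specific augmentation used, so this sketch swaps in the well-known Exp3 scheme (exponential weights with importance-weighted reward estimates and uniform exploration) purely as an illustration. The names `exp3`, `stationary_opponent`, the exploration rate `gamma`, and the 0.8/0.2 reward probabilities are all assumptions, not taken from the paper.

```python
import math
import random


def mixed_policy(weights, gamma):
    """Mix the normalised weights with uniform exploration at rate gamma."""
    n = len(weights)
    w_sum = sum(weights)
    return [(1 - gamma) * w / w_sum + gamma / n for w in weights]


def exp3(reward_fn, n_actions, n_rounds, gamma=0.1, seed=0):
    """Exp3-style learner (illustrative; not the paper's exact algorithm).

    reward_fn(t, action, rng) must return a reward in [0, 1]. The learner
    never sees the game matrix or the opponent's action, only this scalar.
    Returns the average payoff actually received and the final policy.
    """
    rng = random.Random(seed)
    weights = [1.0] * n_actions
    probs = mixed_policy(weights, gamma)
    total = 0.0
    for t in range(n_rounds):
        probs = mixed_policy(weights, gamma)
        # Sample an action from the current mixed policy.
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = reward_fn(t, action, rng)
        total += reward
        # Importance-weighted estimate: unbiased for the played action,
        # implicitly zero for every action that was not played.
        estimate = reward / probs[action]
        weights[action] *= math.exp(gamma * estimate / n_actions)
        # Rescale so the weights stay bounded over long runs.
        w_max = max(weights)
        weights = [w / w_max for w in weights]
    probs = mixed_policy(weights, gamma)
    average = total / n_rounds if n_rounds else 0.0
    return average, probs


def stationary_opponent(t, action, rng):
    """Hypothetical stationary opponent: action 1 pays off with
    probability 0.8, action 0 with probability 0.2."""
    p = 0.8 if action == 1 else 0.2
    return 1.0 if rng.random() < p else 0.0
```

Calling `exp3(stationary_opponent, n_actions=2, n_rounds=5000)` runs the learner against the hypothetical stationary opponent. Under these assumptions the policy should concentrate on action 1, so the average payoff approaches the best-response payoff up to the cost of exploration. This corresponds only to case (1) of the abstract, and the sketch says nothing about the polynomial bounds themselves.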