Perspectives on multiagent learning

Authors:
Tuomas Sandholm
Affiliations:
Carnegie Mellon University, Computer Science Department, Pittsburgh, PA 15213, USA
Venue:
Artificial Intelligence
Year:
2007

Abstract

I lay out a slight refinement of Shoham et al.'s taxonomy of agendas that I consider sensible for multiagent learning (MAL) research. It is not intended to be rigid: senseless work can be done within these agendas and additional sensible agendas may arise. Within each agenda, I identify issues and suggest directions. In the computational agenda, direct algorithms are often more efficient, but MAL plays a role especially when the rules of the game are unknown or direct algorithms are not known for the class of games. In the descriptive agenda, more emphasis should be placed on establishing what classes of learning rules actually model learning by multiple humans or animals. Also, the agenda is, in a way, circular. This has a positive side too: it can be used to verify the learning models. In the prescriptive agendas, the desiderata need to be made clear and should guide the design of MAL algorithms. The algorithms need not mimic humans' or animals' learning. I discuss some worthy desiderata; some from the literature do not seem well motivated. The learning problem is interesting both in cooperative and noncooperative settings, but the concerns are quite different. For many, if not most, noncooperative settings, future work should increasingly consider the learning itself strategically. Lower bounds cut across the agendas. They can be derived on the computational complexity and on the number of interactions needed.