Nash Q-learning for general-sum stochastic games

Authors:
Junling Hu; Michael P. Wellman
Affiliations:
Talkai Research, 843 Roble Ave., 2, Menlo Park, CA; Artificial Intelligence Laboratory, University of Michigan, Ann Arbor, MI
Venue:
The Journal of Machine Learning Research
Year:
2003

Abstract

We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
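
The update described in the abstract (Q-functions over joint actions, backed up with the Nash equilibrium value of the next state's stage game) can be sketched for two players as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pure_nash` and `nash_q_update`, the learning rate `alpha`, and the discount factor `gamma` are hypothetical, and the equilibrium search only enumerates pure-strategy equilibria and takes the first one found, whereas a general stage game may require a mixed-strategy solver. For brevity a single function updates both players' tables.

```python
import numpy as np

def pure_nash(q1, q2):
    """Return the first pure-strategy Nash equilibrium (a1, a2) of the
    bimatrix stage game (q1, q2), or None if no pure equilibrium exists.
    Assumption: pure strategies only; this is a simplification."""
    n1, n2 = q1.shape
    for a1 in range(n1):
        for a2 in range(n2):
            best1 = q1[a1, a2] >= q1[:, a2].max()  # player 1 cannot gain by deviating
            best2 = q2[a1, a2] >= q2[a1, :].max()  # player 2 cannot gain by deviating
            if best1 and best2:
                return a1, a2
    return None

def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """One Nash-Q-style update of both joint-action Q-tables, in place.
    Q1 and Q2 have shape (n_states, n_actions_1, n_actions_2)."""
    eq = pure_nash(Q1[s_next], Q2[s_next])
    if eq is None:
        raise ValueError("no pure-strategy equilibrium; a mixed-strategy solver is needed")
    e1, e2 = eq
    # Equilibrium payoffs of the next-state stage game, read before any write.
    v1, v2 = Q1[s_next, e1, e2], Q2[s_next, e1, e2]
    Q1[s, a1, a2] = (1 - alpha) * Q1[s, a1, a2] + alpha * (r1 + gamma * v1)
    Q2[s, a1, a2] = (1 - alpha) * Q2[s, a1, a2] + alpha * (r2 + gamma * v2)

# Usage: two states, two actions per player; state 1 holds a
# prisoner's-dilemma-shaped stage game (illustrative numbers only).
Q1 = np.zeros((2, 2, 2))
Q2 = np.zeros((2, 2, 2))
Q1[1] = [[3, 0], [5, 1]]
Q2[1] = [[3, 5], [0, 1]]
nash_q_update(Q1, Q2, s=0, a1=0, a2=0, r1=1.0, r2=2.0, s_next=1, alpha=0.5)
```

The only difference from single-agent Q-learning in this sketch is the backup target: instead of each player maximizing over its own actions, both tables are backed up with their payoffs at an equilibrium of the next-state stage game. When a stage game has several equilibria, which one is selected affects the result, which is consistent with the abstract's observation that convergence is less reliable in the game with multiple equilibrium Q-functions.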