Dynamic programming for partially observable stochastic games

Authors:
Eric A. Hansen;Daniel S. Bernstein;Shlomo Zilberstein
Affiliations:
Dept. of Computer Science and Engineering, Mississippi State University, Mississippi State, MS;Department of Computer Science, University of Massachusetts, Amherst, MA;Department of Computer Science, University of Massachusetts, Amherst, MA
Venue:
AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Year:
2004

Abstract

We develop an exact dynamic programming algorithm for partially observable stochastic games (POSGs). The algorithm is a synthesis of dynamic programming for partially observable Markov decision processes (POMDPs) and iterated elimination of dominated strategies in normal form games. We prove that when applied to finite-horizon POSGs, the algorithm iteratively eliminates very weakly dominated strategies without first forming a normal form representation of the game. For the special case in which agents share the same payoffs, the algorithm can be used to find an optimal solution. We present preliminary empirical results and discuss ways to further exploit POMDP theory in solving POSGs.
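To make the notion of iterated elimination of very weakly dominated strategies concrete, the following is a minimal sketch on an explicit two-player payoff table. It is not the paper's algorithm, which according to the abstract avoids forming the normal form at all; it only illustrates the elimination step. It also assumes, as a simplification, that a strategy may be dominated only by another pure strategy. The function names and the example game are hypothetical.

```python
from typing import List, Tuple

# payoffs[r][c] = (row player's payoff, column player's payoff)
Payoffs = List[List[Tuple[float, float]]]


def very_weakly_dominated(payoffs: Payoffs, player: int, s: int,
                          alive_own: List[int], alive_other: List[int]) -> bool:
    """True if some other surviving pure strategy does at least as well as s
    against every surviving opponent strategy (ties everywhere are allowed)."""
    def u(own: int, other: int) -> float:
        if player == 0:
            return payoffs[own][other][0]
        return payoffs[other][own][1]

    return any(
        t != s and all(u(t, o) >= u(s, o) for o in alive_other)
        for t in alive_own
    )


def iterated_elimination(payoffs: Payoffs) -> Tuple[List[int], List[int]]:
    """Repeatedly remove very weakly dominated pure strategies for both
    players until nothing more can be removed. Assumption: domination by
    mixed strategies is not checked in this sketch."""
    rows = list(range(len(payoffs)))
    cols = list(range(len(payoffs[0])))
    changed = True
    while changed:
        changed = False
        for player in (0, 1):
            own, other = (rows, cols) if player == 0 else (cols, rows)
            for s in list(own):
                # Always keep at least one strategy per player.
                if len(own) > 1 and very_weakly_dominated(
                        payoffs, player, s, own, other):
                    own.remove(s)
                    changed = True
    return rows, cols


# Hypothetical common-payoff game: both players receive the same value.
game: Payoffs = [
    [(3, 3), (1, 1)],
    [(2, 2), (1, 1)],
    [(0, 0), (4, 4)],
]
surviving_rows, surviving_cols = iterated_elimination(game)
```

In this example row 1 is removed because row 0 does at least as well against every column. Because very weak dominance permits ties, two identical strategies dominate each other, so the sketch removes them one at a time and never empties a player's strategy set.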