Average-reward decentralized Markov decision processes

Authors:
Marek Petrik;Shlomo Zilberstein
Affiliations:
Department of Computer Science, University of Massachusetts, Amherst, MA;Department of Computer Science, University of Massachusetts, Amherst, MA
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

Abstract

Formal analysis of decentralized decision making has become a thriving research area in recent years, producing a number of multi-agent extensions of Markov decision processes. While much of the work has focused on optimizing discounted cumulative reward, optimizing average reward is sometimes a more suitable criterion. We formalize a class of such problems and analyze its characteristics, showing that it is NP-complete and that optimal policies are deterministic. Our analysis lays the foundation for designing two optimal algorithms. Experimental results with a standard problem from the literature illustrate the applicability of these solution techniques.
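
For readers unfamiliar with the two criteria contrasted in the abstract, they can be written in their standard single-agent textbook form as below. The notation is an assumption for illustration and is not taken from the paper: a policy \pi, state s_t and action a_t at step t, reward function r, and discount factor \gamma. The paper's decentralized formulation may define these quantities differently.

```latex
% Discounted cumulative reward of policy \pi from start state s
\[
V_\gamma^{\pi}(s) \;=\; \mathbb{E}^{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \;\middle|\; s_0 = s \right],
\qquad 0 \le \gamma < 1
\]

% Average reward (gain) of policy \pi from start state s
\[
g^{\pi}(s) \;=\; \liminf_{N \to \infty} \frac{1}{N}\, \mathbb{E}^{\pi}\!\left[\, \sum_{t=0}^{N-1} r(s_t, a_t) \;\middle|\; s_0 = s \right]
\]
```

The discounted criterion weights early rewards more heavily, whereas the average-reward criterion depends only on long-run behavior, which is one reason it can suit tasks that continue indefinitely.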