Markov decision processes with multiple long-run average objectives

  • Authors: Krishnendu Chatterjee
  • Affiliations: UC Berkeley
  • Venue: FSTTCS'07 Proceedings of the 27th International Conference on Foundations of Software Technology and Theoretical Computer Science
  • Year: 2007

Abstract

We consider Markov decision processes (MDPs) with multiple long-run average objectives. Such MDPs arise in design problems where one wishes to optimize several criteria simultaneously, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto optimal point can be ε-approximated by a memoryless strategy, for all ε > 0. In contrast to the single-objective case, the memoryless strategy may require randomization. We show that the Pareto curve can be approximated (a) in polynomial time in the size of the MDP for irreducible MDPs; and (b) in polynomial space in the size of the MDP for all MDPs. Additionally, we study the problem of whether a given value vector is realizable by some strategy, and show that it can be decided in polynomial time for irreducible MDPs and in NP for all MDPs. These results provide algorithms for design exploration in MDP models with multiple long-run average objectives.
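To illustrate the realizability question for irreducible MDPs, the following is a minimal sketch based on the standard linear-programming formulation of long-run average rewards via state-action frequencies; it is not necessarily the paper's exact algorithm, and the function name `realizable`, the array layout, and the use of `scipy.optimize.linprog` are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def realizable(P, R, v):
    """Check whether value vector v is achievable in an irreducible MDP.

    P: transitions, shape (S, A, S); P[s, a, s2] = Pr(s2 | s, a)
    R: rewards, shape (K, S, A); R[k, s, a] = k-th reward of action a in state s
    v: target value vector, length K
    Returns a memoryless randomized strategy (S x A array) achieving at least v
    in every long-run average objective, or None if v is not realizable.
    (Illustrative sketch; not the paper's algorithm verbatim.)
    """
    S, A, _ = P.shape
    K = R.shape[0]
    n = S * A                                  # one variable x[s, a] per state-action pair

    # Equality constraints: stationarity (flow balance) per state, plus total mass 1.
    A_eq = np.zeros((S + 1, n))
    b_eq = np.zeros(S + 1)
    for sp in range(S):
        for s in range(S):
            for a in range(A):
                A_eq[sp, s * A + a] += P[s, a, sp]   # inflow into state sp
        for a in range(A):
            A_eq[sp, sp * A + a] -= 1.0              # outflow of state sp
    A_eq[S, :] = 1.0                                 # frequencies sum to 1
    b_eq[S] = 1.0

    # Inequality constraints: each long-run average reward reaches its target,
    # i.e. sum_{s,a} x[s,a] * R[k,s,a] >= v[k], written as -R x <= -v for linprog.
    A_ub = -R.reshape(K, n)
    b_ub = -np.asarray(v, dtype=float)

    res = linprog(c=np.zeros(n), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        return None

    # Read off the memoryless randomized strategy from the state-action frequencies.
    x = res.x.reshape(S, A)
    with np.errstate(invalid="ignore", divide="ignore"):
        pi = x / x.sum(axis=1, keepdims=True)
    return np.nan_to_num(pi, nan=1.0 / A)
```

The strategy recovered from the frequencies is memoryless but in general randomized, matching the abstract's observation that randomization may be necessary in the multi-objective setting; for general (reducible) MDPs this single LP is no longer sufficient, which is consistent with the weaker NP bound stated above.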