Markov decision processes with multiple objectives

Authors:
Krishnendu Chatterjee;Rupak Majumdar;Thomas A. Henzinger
Affiliations:
UC Berkeley;UC Los Angeles;UC Berkeley
Venue:
STACS'06 Proceedings of the 23rd Annual conference on Theoretical Aspects of Computer Science
Year:
2006


Abstract

We consider Markov decision processes (MDPs) with multiple discounted reward objectives. Such MDPs occur in design problems where one wishes to simultaneously optimize several criteria, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto-optimal point can be achieved by a memoryless strategy; however, unlike in the single-objective case, the memoryless strategy may require randomization. Moreover, we show that the Pareto curve can be approximated in polynomial time in the size of the MDP. Additionally, we study the problem of whether a given value vector is realizable by any strategy, and show that it can be decided in polynomial time; in contrast, the question of whether it is realizable by a deterministic memoryless strategy is NP-complete. These results provide efficient algorithms for design exploration in MDP models with multiple objectives.
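The claim that a memoryless strategy may need randomization can be illustrated with a small sketch. The toy MDP below is an assumption made for illustration and is not taken from the paper: it has a single state and two self-looping actions, where the hypothetical action 'a' yields the reward vector (1, 0) and 'b' yields (0, 1). The function name discounted_value and the discount factor 1/2 are likewise illustrative choices.

```python
from fractions import Fraction


def discounted_value(p, gamma):
    """Value vector of the memoryless strategy that plays action 'a' with
    probability p and action 'b' with probability 1 - p in the single state.

    Assumed toy MDP (not from the paper): one state, both actions loop back
    to it; 'a' yields reward vector (1, 0) and 'b' yields (0, 1).
    """
    expected_reward = (p, 1 - p)   # expected one-step reward per objective
    scale = 1 / (1 - gamma)        # sum of gamma**t over t = 0, 1, 2, ...
    return tuple(r * scale for r in expected_reward)


gamma = Fraction(1, 2)  # assumed discount factor

# The only two deterministic memoryless strategies: always 'a' or always 'b'.
pure_a = discounted_value(Fraction(1), gamma)
pure_b = discounted_value(Fraction(0), gamma)

# A randomized memoryless strategy that mixes the two actions evenly.
mixed = discounted_value(Fraction(1, 2), gamma)

print("always a:", pure_a)
print("always b:", pure_b)
print("50/50 mix:", mixed)
```

In this toy model every step yields a total reward of exactly 1 across the two objectives, so the two components of any value vector sum to 1 / (1 - gamma) and every achievable point is Pareto-optimal. The evenly mixed strategy reaches a point that neither deterministic memoryless strategy attains, which is the phenomenon the abstract describes. With a single objective, one of the two pure strategies would already be optimal.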