A survey of multi-objective sequential decision-making

Authors:
Diederik M. Roijers; Peter Vamplew; Shimon Whiteson; Richard Dazeley
Affiliations:
Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands; School of Science, Information Technology and Engineering, University of Ballarat, Ballarat, Australia; Informatics Institute, University of Amsterdam, Amsterdam, The Netherlands; School of Science, Information Technology and Engineering, University of Ballarat, Ballarat, Australia
Venue:
Journal of Artificial Intelligence Research
Year:
2013

Abstract

Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there is a growing body of literature on this subject, little of it makes explicit under what circumstances special methods are needed to solve multi-objective problems. Therefore, we identify three distinct scenarios in which converting such a problem to a single-objective one is impossible, infeasible, or undesirable. Furthermore, we propose a taxonomy that classifies multi-objective methods according to the applicable scenario, the nature of the scalarization function (which projects multi-objective values to scalar ones), and the type of policies considered. We show how these factors determine the nature of an optimal solution, which can be a single policy, a convex hull, or a Pareto front. Using this taxonomy, we survey the literature on multi-objective methods for planning and learning. Finally, we discuss key applications of such methods and outline opportunities for future work.
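To make the distinction between a Pareto front and a convex hull concrete, the following is a minimal sketch and not an algorithm taken from the survey. It assumes two objectives that are both maximized, a linear scalarization function (a weighted sum), and a small set of made-up value vectors for four deterministic policies. All names (`policy_values`, `linear_scalarization`, `pareto_front`, `best_under_weights`) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Set, Tuple

Value = Tuple[float, ...]


def linear_scalarization(value: Sequence[float], weights: Sequence[float]) -> float:
    """Project a multi-objective value vector to a scalar via a weighted sum."""
    return sum(w * v for w, v in zip(weights, value))


def dominates(a: Value, b: Value) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(values: List[Value]) -> List[Value]:
    """Keep only the value vectors that no other vector Pareto-dominates."""
    return [v for v in values if not any(dominates(u, v) for u in values)]


def best_under_weights(values: List[Value], weights: Sequence[float]) -> Value:
    """Return the value vector that maximizes the linear scalarization."""
    return max(values, key=lambda v: linear_scalarization(v, weights))


# Hypothetical value vectors of four policies (illustrative data, not from the survey).
policy_values: List[Value] = [(1.0, 9.0), (4.0, 4.0), (9.0, 1.0), (3.0, 3.0)]

# (3, 3) is dominated by (4, 4); the other three are Pareto-optimal.
front = pareto_front(policy_values)

# Sweep the weight on the first objective from 0 to 1 in steps of 0.1 and
# collect every policy value that is optimal for at least one weight.
winners: Set[Value] = {
    best_under_weights(policy_values, (w / 10, 1 - w / 10)) for w in range(11)
}

print("Pareto front:", front)
print("Optimal under some linear scalarization:", sorted(winners))
```

In this toy data, (4, 4) lies on the Pareto front but is never the best choice under any weight in the sweep, because it sits below the line joining (1, 9) and (9, 1). This is one way to see why the type of scalarization function affects which solution set is needed: with linear scalarization, only the points on the convex hull can be optimal, while other scalarization functions may prefer Pareto-optimal points that lie off it.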