On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts

  • Authors:
  • Peter Vamplew; John Yearwood; Richard Dazeley; Adam Berry

  • Affiliations:
  • School of ITMS, University of Ballarat, Ballarat, Australia (all authors)

  • Venue:
  • AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008


Abstract

Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting objectives. This paper argues for designing MORL systems to produce a set of solutions approximating the Pareto front, and shows that the common MORL technique of scalarisation has fundamental limitations when used to find Pareto-optimal policies. The work is supported by the presentation of three new MORL benchmarks with known Pareto fronts.