On the Limitations of Scalarisation for Multi-objective Reinforcement Learning of Pareto Fronts

  • Authors:
  • Peter Vamplew; John Yearwood; Richard Dazeley; Adam Berry

  • Affiliations:
  • School of ITMS, University of Ballarat, Ballarat, Australia (all authors)

  • Venue:
  • AI '08 Proceedings of the 21st Australasian Joint Conference on Artificial Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2008


Abstract

Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting objectives. This paper argues for designing MORL systems to produce a set of solutions approximating the Pareto front, and shows that the common MORL technique of scalarisation has fundamental limitations when used to find Pareto-optimal policies. The work is supported by the presentation of three new MORL benchmarks with known Pareto fronts.