A survey of multi-objective sequential decision-making
Journal of Artificial Intelligence Research
We recently proposed EDA-RL, Estimation of Distribution Algorithms for Reinforcement Learning problems. EDA-RL improves policies through the EDA scheme: first, it selects the better episodes; second, it estimates probabilistic models, i.e., policies; finally, it interacts with the environment to generate new episodes. In this paper, EDA-RL is extended to Multi-Objective Reinforcement Learning problems, in which the reward is given according to several criteria. By incorporating notions from Evolutionary Multi-Objective Optimization, the proposed method is able to acquire various strategies in a single run.
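The selection and estimation steps can be illustrated with a minimal sketch. The abstract does not specify the selection rule or the form of the probabilistic model, so everything here is an assumption made for illustration: episodes are selected by Pareto non-dominance over their return vectors (a common notion in evolutionary multi-objective optimization), and the policy is a tabular frequency estimate. The names `dominates`, `select_nondominated`, `estimate_policy`, and the episode layout are hypothetical and do not come from the paper. The third step, interacting with the environment to generate new episodes, is omitted.

```python
from collections import Counter, defaultdict


def dominates(a, b):
    """Return True if return vector a Pareto-dominates b (maximization)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def select_nondominated(episodes):
    """Assumed selection step: keep episodes not dominated by any other."""
    return [
        e for e in episodes
        if not any(dominates(o["returns"], e["returns"]) for o in episodes)
    ]


def estimate_policy(episodes):
    """Assumed estimation step: pi(a|s) as relative (state, action) frequency."""
    counts = defaultdict(Counter)
    for e in episodes:
        for state, action in e["steps"]:
            counts[state][action] += 1
    return {
        s: {a: n / sum(c.values()) for a, n in c.items()}
        for s, c in counts.items()
    }


# Toy data: one-step episodes with two reward criteria each.
episodes = [
    {"steps": [("s0", "left")], "returns": (1.0, 0.0)},
    {"steps": [("s0", "right")], "returns": (0.0, 1.0)},
    {"steps": [("s0", "stay")], "returns": (0.0, 0.0)},
]

selected = select_nondominated(episodes)
policy = estimate_policy(selected)
print(policy)
```

In this toy data, the "stay" episode is dominated by both of the others and is discarded, while "left" and "right" are mutually non-dominated and both survive. Keeping such mutually non-dominated episodes is one way a single run could retain several distinct strategies.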