A New Distributed Reinforcement Learning Algorithm for Multiple Objective Optimization Problems

Authors:
Carlos Mariano; Eduardo F. Morales
Venue:
IBERAMIA-SBIA '00 Proceedings of the International Joint Conference, 7th Ibero-American Conference on AI: Advances in Artificial Intelligence
Year:
2000

Citing 3

Multiobjective evolutionary algorithm test suites
Proceedings of the 1999 ACM symposium on Applied computing

Sequential Optimality and Coordination in Multiagent Systems
IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence

A New Approach for the Solution of Multiple Objective Optimization Problems Based on Reinforcement Learning
MICAI '00 Proceedings of the Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence

Cited 4

Comparing Distributed Reinforcement Learning Approaches to Learn Agent Coordination
IBERAMIA 2002 Proceedings of the 8th Ibero-American Conference on AI: Advances in Artificial Intelligence

Multi-policy optimization in self-organizing systems
SOAR'09 Proceedings of the First international conference on Self-organizing architectures

Emergent consensus in decentralised systems using collaborative reinforcement learning
Self-star Properties in Complex Information Systems

A survey of multi-objective sequential decision-making
Journal of Artificial Intelligence Research

Abstract

This paper describes a new algorithm, called MDQL, for solving multiple objective optimization problems. MDQL is based on a new distributed Q-learning algorithm, called DQL, which is also introduced in this paper. In DQL, a family of independent agents, each exploring different options, finds a common policy in a common environment. Information about how good each action is gets transmitted through traces over state-action pairs. MDQL extends this idea to multiple objectives by assigning a family of agents to each objective involved. A non-dominance criterion is used to construct Pareto fronts, and by delaying adjustments to the rewards MDQL achieves better distributions of solutions. An extension for applying reinforcement learning to continuous functions is also given. Successful results of MDQL on several test-bed problems suggested in the literature are described.
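
The non-dominance test behind a Pareto front can be illustrated with a minimal sketch. This is not the MDQL procedure itself, which the abstract does not specify; it only shows the generic filter that keeps solutions no other solution dominates. The function names `dominates` and `pareto_front`, the sample points, and the choice of minimizing every objective are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming every objective is minimized.

    a dominates b when a is no worse in all objectives and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective candidate solutions (illustrative values only).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
print(front)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

Here (3.0, 4.0) and (2.5, 3.5) are dropped because (2.0, 3.0) is better in both objectives; the remaining three points trade one objective against the other and so form the front. The pairwise filter is quadratic in the number of points, which is adequate for a small illustration.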