Evolving policies for multi-reward partially observable Markov decision processes (MR-POMDPs)

  • Authors:
  • Harold Soh; Yiannis Demiris

  • Affiliations:
  • Imperial College London, London, United Kingdom; Imperial College London, London, United Kingdom

  • Venue:
  • Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation (GECCO '11)
  • Year:
  • 2011

Abstract

Plans and decisions in many real-world scenarios are made under uncertainty and to satisfy multiple, possibly conflicting, objectives. In this work, we contribute the multi-reward partially observable Markov decision process (MR-POMDP) as a general modelling framework. To solve MR-POMDPs, we present two hybrid (memetic) multi-objective evolutionary algorithms that generate non-dominated sets of policies (in the form of stochastic finite-state controllers). Performance comparisons between the methods on multi-objective problems in robotics (with 2, 3 and 5 objectives), web advertising (with 3, 4 and 5 objectives) and infectious disease control (with 3 objectives) revealed that the memetic variants outperformed their original counterparts. We anticipate that the MR-POMDP, along with multi-objective evolutionary solvers, will prove useful in a variety of theoretical and real-world applications.
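
The abstract's two central ingredients, Pareto non-dominance over vector-valued policy returns and policies represented as stochastic finite-state controllers, can be made concrete with a short sketch. This is an illustrative assumption-laden reading, not the paper's implementation: the names (`pareto_front`, `StochasticFSC`), array shapes, and the example values below are all hypothetical.

```python
import numpy as np

def dominates(u, v):
    """True if value vector u Pareto-dominates v: u is no worse in every
    objective and strictly better in at least one."""
    u, v = np.asarray(u), np.asarray(v)
    return bool(np.all(u >= v) and np.any(u > v))

def pareto_front(values):
    """Filter per-policy value vectors down to the non-dominated set."""
    return [u for i, u in enumerate(values)
            if not any(dominates(v, u) for j, v in enumerate(values) if j != i)]

class StochasticFSC:
    """Sketch of a stochastic finite-state controller (hypothetical layout):
    each node holds a distribution over actions, and each (node, observation)
    pair holds a distribution over successor nodes."""
    def __init__(self, action_probs, node_trans, rng=None):
        self.action_probs = np.asarray(action_probs)  # shape (N, |A|)
        self.node_trans = np.asarray(node_trans)      # shape (N, |O|, N)
        self.rng = rng or np.random.default_rng()
        self.node = 0  # start node

    def act(self):
        # Sample an action from the current node's action distribution.
        return self.rng.choice(self.action_probs.shape[1],
                               p=self.action_probs[self.node])

    def observe(self, obs):
        # Stochastically transition to a successor node given the observation.
        self.node = self.rng.choice(self.node_trans.shape[2],
                                    p=self.node_trans[self.node, obs])

# Example: three candidate policies evaluated on two objectives.
values = [np.array([1.0, 3.0]), np.array([2.0, 2.0]), np.array([0.5, 1.0])]
print(pareto_front(values))  # third vector is dominated, so two remain
```

In an evolutionary solver of the kind the abstract describes, the controller parameters would be varied by the search operators, each candidate's value vector estimated per objective, and a filter like `pareto_front` used to retain the non-dominated set.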