Multi-reward policies for medical applications: anthrax attacks and smart wheelchairs

Authors:
Harold Soh;Yiannis Demiris
Affiliations:
Imperial College London, London, United Kingdom;Imperial College London, London, United Kingdom
Venue:
Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
Year:
2011



Abstract

Medical decisions are often difficult; they involve uncertain information, multiple objectives and debatable outcomes. In this work, we discuss the application of the multi-reward partially observable Markov decision process (MR-POMDP) and NSGA2-LS, a hybridised multi-objective evolutionary solver, to two problems in the medical domain: anthrax response and smart-wheelchair control. For the first problem, we use a discrete model and analyse the trade-offs between the best solutions (in the form of finite-state controllers) found by our evolutionary algorithm. For the second, we extend our method to continuous spaces and optimise recurrent neural networks (RNNs) for use on medical robots such as smart wheelchairs.
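
To illustrate what "trade-offs between the best solutions" means in a multi-reward setting, the following is a minimal Python sketch of Pareto (non-dominated) filtering over per-policy reward vectors. It is not the paper's NSGA2-LS solver; the function names and the reward values are hypothetical, and every reward is assumed to be maximised.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if reward vector a Pareto-dominates b.

    Assumes all rewards are maximised: a must be at least as good as b
    on every reward and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the reward vectors that no other vector dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical expected rewards for four candidate policies, written as
# (reward 1, reward 2). The numbers are made up for illustration only.
candidates = [(-10.0, -5.0), (-8.0, -7.0), (-12.0, -6.0), (-8.0, -9.0)]

front = non_dominated(candidates)
print(front)
```

The surviving vectors are the ones a decision-maker would compare: moving from one to another improves one reward only at the expense of another.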