Linear fitted-Q iteration with multiple reward functions

Authors:
Daniel J. Lizotte; Michael Bowling; Susan A. Murphy
Affiliations:
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada; Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Department of Statistics, University of Michigan, Ann Arbor, MI
Venue:
The Journal of Machine Learning Research
Year:
2012

Abstract

We present a general and detailed development of an algorithm for finite-horizon fitted-Q iteration with an arbitrary number of reward signals and linear value function approximation using an arbitrary number of state features. This includes a detailed treatment of the three-reward-function case using triangulation primitives from computational geometry, and a method for identifying globally dominated actions. We also present an example of how our methods can be used to construct a real-world decision aid by considering symptom reduction, weight gain, and quality of life in sequential treatments for schizophrenia. Finally, we discuss future directions for this work that will further enable our methods to make a positive impact on the field of evidence-based clinical decision support.
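
To make the setting concrete, the following is a minimal sketch of finite-horizon fitted-Q iteration with linear value function approximation. It is not the algorithm developed in the paper. The abstract does not say how the reward signals are combined, so this sketch assumes they are collapsed into a single scalar reward by one fixed weight vector `delta`. The function name, the data layout, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

def fitted_q_finite_horizon(stages, delta, n_actions):
    """Backward-induction fitted-Q with one linear model per (stage, action).

    stages: list of T dicts (hypothetical layout), each with
        'phi'      (n, p) state features at this stage
        'a'        (n,)   integer actions in {0, ..., n_actions - 1}
        'r'        (n, k) one column per reward signal
        'phi_next' (n, p) next-state features (unused at the final stage)
    delta: (k,) fixed preference weights over the k reward signals
           (an assumption of this sketch, not taken from the paper)
    Returns a list of T arrays of shape (n_actions, p).
    """
    T = len(stages)
    weights = [None] * T
    for t in reversed(range(T)):
        d = stages[t]
        # Scalarize the k reward signals with the fixed weights.
        target = d['r'] @ delta
        if t + 1 < T:
            # Add the value of acting greedily at the next stage.
            q_next = d['phi_next'] @ weights[t + 1].T  # shape (n, n_actions)
            target = target + q_next.max(axis=1)
        w = np.zeros((n_actions, d['phi'].shape[1]))
        for a in range(n_actions):
            # Assumes every action is observed at every stage.
            mask = d['a'] == a
            w[a], *_ = np.linalg.lstsq(d['phi'][mask], target[mask], rcond=None)
        weights[t] = w
    return weights
```

Under this assumption, the greedy action at stage `t` for a feature vector `phi` is `np.argmax(weights[t] @ phi)`. Rerunning the function with a different `delta` gives the policy for a different trade-off between the reward signals, one trade-off at a time.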