A Hybrid Genetic/Optimization Algorithm for Finite-Horizon, Partially Observed Markov Decision Processes

Authors:
Zong-Zhi Lin;James C. Bean;Chelsea C. White
Venue:
INFORMS Journal on Computing
Year:
2004

Citing 0
Cited 4

Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes
Simulation

Reinforcement Learning: A Tutorial Survey and Recent Advances
INFORMS Journal on Computing

A survey on metaheuristics for stochastic combinatorial optimization
Natural Computing: an international journal

Combining metaheuristics and exact algorithms in combinatorial optimization: a survey and classification
IWINAC'05 Proceedings of the First international work-conference on the Interplay Between Natural and Artificial Computation conference on Artificial Intelligence and Knowledge Engineering Applications: a bioinspired approach - Volume Part II

Abstract

The partially observed Markov decision process (POMDP) is a generalization of a Markov decision process that allows for noise-corrupted and costly observations of the underlying system state. The value function of the finite-horizon POMDP is known to be piecewise affine and convex in the probability mass vector over the state space. Such a function can be represented by a finite set of affine functions. In this paper, we develop and evaluate an exact algorithm, GAMIP, which combines a genetic algorithm and a mixed integer program to construct the minimal set of affine functions that describes the value function. Numerical results indicate that GAMIP takes up to 60% less time to construct the minimal set than does the most efficient linear programming-based exact solution method in the literature.
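
As an illustration of the representation the abstract describes, the sketch below stores a piecewise affine, convex value function as a finite set of coefficient vectors and evaluates it as a maximum over them. It is not GAMIP and is not taken from the paper: the names `value` and `drop_pointwise_dominated` are hypothetical, and the pruning step is only a cheap component-wise filter that does not, in general, yield the minimal set.

```python
from typing import List, Sequence

Vector = Sequence[float]


def value(belief: Vector, alphas: List[Vector]) -> float:
    """Evaluate V(b) = max over alpha of (alpha . b).

    A pointwise maximum of affine functions is piecewise affine and convex.
    `belief` is assumed to be a probability mass vector over the states.
    """
    return max(sum(a * b for a, b in zip(alpha, belief)) for alpha in alphas)


def drop_pointwise_dominated(alphas: List[Vector]) -> List[Vector]:
    """Remove vectors that are component-wise <= some other vector.

    Assumption: this is a simple pre-filter only. A vector can survive it
    and still never attain the maximum at any belief, so finding the
    minimal set needs a stronger test (for example an LP or MIP).
    """
    kept = []
    for i, a in enumerate(alphas):
        dominated = False
        for j, other in enumerate(alphas):
            if i == j:
                continue
            at_least = all(o >= x for o, x in zip(other, a))
            strictly = any(o > x for o, x in zip(other, a))
            # For exact duplicates, keep only the first occurrence.
            if at_least and (strictly or j < i):
                dominated = True
                break
        if not dominated:
            kept.append(a)
    return kept


alphas = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.0, 0.0)]
pruned = drop_pointwise_dominated(alphas)
```

In this toy example the vector (0.0, 0.0) is removed, but (0.4, 0.4) survives the filter even though it never attains the maximum at any belief, which is the kind of redundancy an exact method has to detect.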