Cross-entropic learning of a machine for the decision in a partially observable universe

Authors:
Frédéric Dambreville
Affiliations:
Délégation Générale pour l'Armement, DGA/DET/CEP/ASC/GIP, Arcueil, F 94114, France
Venue:
Journal of Global Optimization
Year:
2007

Abstract

This paper addresses optimal decision-making in a partially observable universe. Our approach directly approximates an optimal strategy tree that depends on the observations, by means of a parameterized probabilistic law. A particular family of hidden Markov models (HMMs), with input and output, serves as the policy model. A method for optimizing the parameters of these HMMs is proposed and applied; it is based on the cross-entropic (CE) principle for rare-event simulation developed by Rubinstein.
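To make the cross-entropic idea concrete, the following is a minimal sketch and not the paper's method. It assumes a much simpler policy model than the input/output HMMs of the abstract: a memoryless table `theta[o][a]` giving the probability of action `a` under observation `o`. The function name `cross_entropy_policy_search`, the toy objective `toy_score`, and all parameter values are hypothetical and chosen only for illustration. The loop samples candidate policies from the current law, keeps the best-scoring fraction, and moves the law toward the empirical frequencies of that elite set.

```python
import random

def cross_entropy_policy_search(n_obs, n_act, score, n_samples=200,
                                elite_frac=0.1, smoothing=0.7,
                                n_iters=30, seed=0):
    """Cross-entropy search over a tabular stochastic policy (illustrative only)."""
    rng = random.Random(seed)
    # theta[o][a]: probability of choosing action a when observing o.
    # Start from the uniform law (assumption: no prior knowledge).
    theta = [[1.0 / n_act] * n_act for _ in range(n_obs)]
    n_elite = max(1, int(elite_frac * n_samples))
    actions = range(n_act)
    for _ in range(n_iters):
        # 1. Sample candidate policies (one action per observation) from theta.
        samples = [
            [rng.choices(actions, weights=theta[o])[0] for o in range(n_obs)]
            for _ in range(n_samples)
        ]
        # 2. Keep the best-scoring fraction (the "elite" samples).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # 3. Move theta toward the elite's empirical action frequencies.
        #    The smoothing keeps every row a valid probability distribution.
        for o in range(n_obs):
            for a in actions:
                freq = sum(1 for p in elite if p[o] == a) / n_elite
                theta[o][a] = smoothing * freq + (1.0 - smoothing) * theta[o][a]
    return theta

# Hypothetical toy objective: reward policies that pick action (o mod 3)
# for each observation o. It stands in for an evaluation of the policy.
TARGET = [o % 3 for o in range(5)]

def toy_score(policy):
    return sum(1 for o, a in enumerate(policy) if a == TARGET[o])

if __name__ == "__main__":
    learned = cross_entropy_policy_search(n_obs=5, n_act=3, score=toy_score)
    for o, row in enumerate(learned):
        print(o, [round(p, 3) for p in row])
```

In this sketch the score is a deterministic function of the sampled policy. In a partially observable setting it would instead come from simulating the policy against a model of the world, and the tabular law would be replaced by a model with memory such as the HMM family mentioned in the abstract.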