Feature extraction for decision-theoretic planning in partially observable environments

  • Authors:
  • Hajime Fujita, Yutaka Nakamura, Shin Ishii

  • Affiliations:
  • Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Ikoma, Japan (all authors)

  • Venue:
  • ICANN'06: Proceedings of the 16th International Conference on Artificial Neural Networks, Part I
  • Year:
  • 2006


Abstract

In this article, we propose a feature extraction technique for decision-theoretic planning problems in partially observable stochastic domains and present a novel approach for solving them. To maximize the expected future reward, the agent need only estimate a Markov chain over a statistic related to rewards. In our approach, an auxiliary state variable whose stochastic process satisfies the Markov property, called the internal state, is introduced into the model under the assumption that rewards depend on the pair of an internal state and an action. The agent then estimates the dynamics of the internal-state model by maximum-likelihood inference while acquiring its policy; the internal-state model represents the essential feature necessary for decision-making. Computer simulation results show that our technique can find an appropriate feature for acquiring a good policy, and can achieve faster learning with fewer policy parameters than a conventional algorithm on a reasonably sized partially observable problem.
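To make the model-based idea in the abstract concrete, the following is a minimal toy sketch, not the authors' algorithm: it estimates a discrete internal-state transition model P(s' | s, a) by count-based maximum likelihood, assumes rewards R(s, a) depend only on the (internal state, action) pair, and derives a greedy policy from the estimated model via value iteration. The 2-state/2-action setup, function names, and data are all hypothetical.

```python
import numpy as np

def ml_transition_model(transitions, n_states, n_actions):
    """Count-based maximum-likelihood estimate of P(s' | s, a).

    `transitions` is a list of observed (s, a, s_next) triples over
    the (hypothetical) discrete internal state.
    """
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1
    totals = counts.sum(axis=2, keepdims=True)
    # Fall back to a uniform distribution for unseen (s, a) pairs.
    return np.where(totals > 0, counts / np.maximum(totals, 1), 1.0 / n_states)

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Greedy policy and state values from the estimated model."""
    n_states, _, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (P @ V)        # Q[s, a] = R(s,a) + gamma * E[V(s')]
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return Q.argmax(axis=1), V_new
        V = V_new

# Hypothetical experience: 2 internal states, 2 actions.
transitions = [(0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1),
               (1, 0, 0), (1, 1, 1), (1, 1, 1)]
# Assumed reward model R(s, a): only internal state 1 is rewarding.
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
P = ml_transition_model(transitions, n_states=2, n_actions=2)
policy, V = value_iteration(P, R)
```

On this toy data, action 1 reliably moves the agent into the rewarding internal state, so the greedy policy selects it everywhere. The paper's actual method additionally learns the internal-state representation itself from partial observations, which this sketch takes as given.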