Reactive Navigation Using Reinforcement Learning in Situations of POMDPs

  • Authors:
  • Paolo Puliti;Guido Tascini;Anna Montesanto

  • Venue:
  • IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Bio-inspired Applications of Connectionism-Part II
  • Year:
  • 2001

Abstract

The aim of this work is to identify an architecture that enables reactive navigation through unsupervised learning based on reinforcement learning. To reach this objective, we used Q-learning and a hierarchical structure for the developed architecture. Using these techniques in the presence of Partially Observable Markov Decision Processes (POMDPs) requires some innovations: heuristic techniques for generalizing experience and handling partial observability, a technique for fast updating of the Q function, and the definition of a reinforcement policy suitable for the unsupervised learning of a complex task. The results show satisfactory learning of the navigation task in a simulated environment.
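As a point of reference, the standard tabular Q-learning update the abstract builds on can be sketched as follows. This is a minimal illustration of the textbook rule, not the paper's architecture; the state labels, actions, and parameter values are assumptions for the example.

```python
from collections import defaultdict

def q_learning_update(Q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.9):
    """Textbook tabular update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    # Value of the best action in the next state (0 if unvisited).
    best_next = max(Q[next_state].values()) if Q[next_state] else 0.0
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])

# Q table: state -> action -> value (hypothetical navigation states/actions).
Q = defaultdict(lambda: defaultdict(float))
q_learning_update(Q, "corridor", "forward", 1.0, "junction")
print(Q["corridor"]["forward"])  # 0.1 * (1.0 + 0.9 * 0.0 - 0.0) = 0.1
```

In a POMDP the agent cannot observe the true state, so a plain table keyed on raw observations aliases distinct situations; the heuristics the abstract mentions are what the paper adds on top of this base rule.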