The aim of this work is to identify an architecture that enables reactive navigation through unsupervised learning based on reinforcement learning. To reach this objective, we used Q-learning and a hierarchical structure in the developed architecture. Using these techniques in the presence of Partially Observable Markov Decision Processes (POMDPs) requires some innovations: heuristic techniques for generalizing experience and handling partial observability, a technique for fast updating of the Q function, and the definition of a reinforcement policy adequate for the unsupervised learning of a complex task. The results show satisfactory learning of the navigation task in a simulated environment.
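The abstract does not spell out the update rule the architecture builds on. As background, here is a minimal sketch of the standard tabular Q-learning update; the state and action names are illustrative, not taken from the paper:

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    # Tabular Q-learning update:
    # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[next_state].values(), default=0.0)
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
    return Q[state][action]

# Toy example with two states and two actions (hypothetical names).
Q = defaultdict(dict)
Q["s0"] = {"left": 0.0, "right": 0.0}
Q["s1"] = {"left": 1.0, "right": 0.0}

# Bootstraps from max_a' Q(s1, a') = 1.0:
# new value = 0.0 + 0.5 * (0.0 + 0.9 * 1.0 - 0.0) = 0.45
q_update(Q, "s0", "right", 0.0, "s1")
```

Under partial observability the true state is not directly available, which is why the paper layers heuristic treatments of the observation history on top of this basic rule.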