Reactive Navigation Using Reinforment Learning in Situations of POMDPs
IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Bio-inspired Applications of Connectionism-Part II
Optimizing Average Reward Using Discounted Rewards
COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and and 5th European Conference on Computational Learning Theory
Exploiting random walks for learning
Information and Computation
Infinite-horizon policy-gradient estimation
Journal of Artificial Intelligence Research
Experiments with infinite-horizon, policy-gradient estimation
Journal of Artificial Intelligence Research
Hi-index | 0.00 |