Parallel distributed processing: explorations in the microstructure of cognition, vol. 1
Neural networks and the bias/variance dilemma. Neural Computation
Introduction to Reinforcement Learning
Neural Networks: Tricks of the Trade (an outgrowth of a 1996 NIPS workshop)
Least-squares policy iteration. The Journal of Machine Learning Research
Kernel rewards regression: an information efficient batch policy iteration approach. AIA'06: Proceedings of the 24th IASTED International Conference on Artificial Intelligence and Applications
COLT'06: Proceedings of the 19th Annual Conference on Learning Theory
ECML'05: Proceedings of the 16th European Conference on Machine Learning
Gradient calculations for dynamic recurrent neural networks: a survey. IEEE Transactions on Neural Networks
Journal of Artificial Intelligence Research
In this paper we present two substantial extensions of Neural Rewards Regression (NRR) [1]. To obtain a less biased estimator of the Bellman residual and to strengthen the regression character of NRR, we incorporate the improved, Auxiliared Bellman Residual [2] and provide, to the best of our knowledge, the first neural-network-based implementation of this novel Bellman residual minimisation technique. Furthermore, we extend NRR to Policy Gradient Neural Rewards Regression (PGNRR), in which the strategy is encoded directly by a policy network. PGNRR profits both from the data efficiency of the Rewards Regression approach and from the directness of policy search methods. PGNRR also overcomes a crucial drawback of NRR: it considerably enlarges the applicable problem class by admitting continuous action spaces.
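To make the central notion concrete, the following is a minimal, hedged sketch of plain Bellman residual minimisation by gradient descent, not the paper's NRR or Auxiliared variant: a linear Q-function is fitted to a toy random batch of transitions, and, unlike fitted Q-iteration, the gradient flows through both Q-terms of the residual. All names and data below are illustrative assumptions.

```python
import numpy as np

# Sketch: minimise the empirical Bellman residual
#   L(w) = (1/T) * sum_t (Q_w(s_t, a_t) - r_t - gamma * max_a Q_w(s'_t, a))^2
# for a linear Q-function Q_w(s, a) = w[a] . phi(s) on a toy batch.

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 4, 2, 0.9, 0.1

def phi(s):
    """One-hot state features (illustrative choice)."""
    f = np.zeros(n_states)
    f[s] = 1.0
    return f

# toy batch of observed transitions (s, a, r, s')
batch = [(int(rng.integers(n_states)), int(rng.integers(n_actions)),
          float(rng.standard_normal()), int(rng.integers(n_states)))
         for _ in range(200)]

def residual_loss(w):
    """Mean squared Bellman residual over the batch."""
    return float(np.mean([(w[a] @ phi(s) - r - gamma * np.max(w @ phi(s2))) ** 2
                          for s, a, r, s2 in batch]))

w = np.zeros((n_actions, n_states))
loss_before = residual_loss(w)
for _ in range(500):
    grad = np.zeros_like(w)
    for s, a, r, s2 in batch:
        a2 = int(np.argmax(w @ phi(s2)))             # greedy successor action
        delta = w[a] @ phi(s) - r - gamma * (w[a2] @ phi(s2))
        # both Q-terms are differentiated: this is what distinguishes
        # residual minimisation from fitted Q-iteration (frozen target)
        grad[a] += 2 * delta * phi(s)
        grad[a2] -= 2 * delta * gamma * phi(s2)
    w -= lr * grad / len(batch)
loss_after = residual_loss(w)
```

The naive estimator above is biased on stochastic transitions (it penalises transition noise as if it were approximation error); the Auxiliared Bellman Residual cited in the abstract is one technique aimed at reducing exactly this bias.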