Reinforcement Learning in POMDP's via Direct Gradient Ascent

Authors:
Jonathan Baxter; Peter L. Bartlett
Venue:
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Year:
2000

Cited by 16:

Multiagent learning using a variable learning rate
Artificial Intelligence

Metalearning and neuromodulation
Neural Networks - Computational models of neuromodulation

Policy Gradients with Parameter-Based Exploration for Control
ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I

State-Dependent Exploration for Policy Gradient Methods
ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II

Predictive representations for policy gradient in POMDPs
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Direct Policy Search Reinforcement Learning for Robot Control
Proceedings of the 2005 conference on Artificial Intelligence Research and Development

Existence of multiagent equilibria with limited agents
Journal of Artificial Intelligence Research

Experiments with infinite-horizon, policy-gradient estimation
Journal of Artificial Intelligence Research

Simultaneous adversarial multi-robot learning
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Point-based value iteration: an anytime algorithm for POMDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Robot weightlifting by direct policy search
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2

Exploiting multiple secondary reinforcers in policy gradient reinforcement learning
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2

Online adaptive policies for ensemble classifiers
Neurocomputing

2010 Special Issue: Parameter-exploring policy gradients
Neural Networks

Decision-theoretic Optimal Sampling in Hidden Markov Random Fields
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence

Learning to make predictions in partially observable environments without a generative model
Journal of Artificial Intelligence Research


Abstract