We present an anytime concurrent probabilistic temporal planner (CPTP) that handles continuous and discrete uncertainties and metric functions. Rather than relying on dynamic programming, our approach builds on methods from stochastic local policy search: we optimise a parameterised policy using gradient ascent. The flexibility of this policy-gradient approach, combined with its low memory use, its use of function approximation, and the factorisation of the policy, allows us to tackle complex domains. This factored policy gradient (FPG) planner can optimise the number of steps to the goal, the probability of success, or a combination of both. We compare the FPG planner to other planners on CPTP domains, and on simpler but better-studied non-concurrent, non-temporal probabilistic planning (PP) domains. We also present FPG-ipc, the PP version of the planner, which was successful in the probabilistic track of the Fifth International Planning Competition.
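
To make the policy-gradient idea concrete, below is a minimal, self-contained sketch of online policy-gradient ascent with a factored policy, in the spirit of what the abstract describes. Everything in it is an illustrative assumption: the toy chain domain, the per-task linear-softmax policies, the eligibility-trace update, and all names and hyperparameters are hypothetical, not the FPG planner's actual implementation.

import numpy as np

# Hypothetical toy domain: a chain of N_STATES states; state N_STATES-1 is the
# goal. At each step the agent decides, for each of two "tasks", whether to
# start it; a started task succeeds (moves the agent right) with its own
# probability. Reward 1 on reaching the goal, so gradient ascent on reward
# increases the probability of success.
N_STATES = 10
N_TASKS = 2
SUCCESS_PROB = [0.8, 0.3]

def features(state):
    # One-hot state features; a real planner would use richer features.
    phi = np.zeros(N_STATES)
    phi[state] = 1.0
    return phi

class FactoredPolicy:
    # One independent linear-softmax decision ("start" / "don't start") per
    # task. Because the joint policy is a product of per-task policies, the
    # log-likelihood gradient decomposes into a sum of per-task gradients.
    def __init__(self, rng):
        self.theta = [rng.normal(scale=0.01, size=(2, N_STATES))
                      for _ in range(N_TASKS)]

    def act(self, state, rng):
        phi = features(state)
        choices, grads = [], []
        for t in range(N_TASKS):
            logits = self.theta[t] @ phi
            p = np.exp(logits - logits.max())   # numerically stable softmax
            p /= p.sum()
            c = rng.choice(2, p=p)
            g = -np.outer(p, phi)               # grad of log pi_t(c | state)
            g[c] += phi
            choices.append(c)
            grads.append(g)
        return choices, grads

def run_episode(policy, rng, alpha=0.05, beta=0.9, max_steps=100):
    # One episode of online policy-gradient ascent with a discounted
    # eligibility trace: alpha is the step size, beta the trace discount.
    state = 0
    traces = [np.zeros_like(th) for th in policy.theta]
    for _ in range(max_steps):
        choices, grads = policy.act(state, rng)
        # Transition: any started task that succeeds moves the agent right.
        moved = any(c == 1 and rng.random() < SUCCESS_PROB[t]
                    for t, c in enumerate(choices))
        if moved:
            state = min(state + 1, N_STATES - 1)
        r = 1.0 if state == N_STATES - 1 else 0.0
        for t in range(N_TASKS):
            traces[t] = beta * traces[t] + grads[t]
            policy.theta[t] += alpha * r * traces[t]
        if r > 0:
            return 1.0
    return 0.0

rng = np.random.default_rng(0)
policy = FactoredPolicy(rng)
wins = sum(run_episode(policy, rng) for _ in range(2000))
print(f"goal reached in {wins:.0f} of 2000 training episodes")

The factorisation is the point of the sketch: memory grows linearly with the number of tasks (one small parameter vector per task) rather than exponentially with the joint action space, which is what makes a policy-gradient approach viable in concurrent domains.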