Real-time dynamic programming (RTDP) is a heuristic search algorithm for solving MDPs. We present a modified algorithm called Focused RTDP with several improvements. While RTDP maintains only an upper bound on the long-term reward function, FRTDP maintains two-sided bounds and bases the output policy on the lower bound. FRTDP guides search with a new rule for outcome selection, focusing on parts of the search graph that contribute most to uncertainty about the values of good policies. FRTDP has modified trial termination criteria that should allow it to solve some problems (within ε) that RTDP cannot. Experiments show that for all the problems we studied, FRTDP significantly outperforms RTDP and LRTDP, and converges with up to six times fewer backups than the state-of-the-art HDP algorithm.
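To make the idea of two-sided bounds concrete, the following is a minimal Python sketch of a trial-based solver that keeps a lower and an upper bound on the value of each state, acts greedily on the upper bound during search, and reads the output policy off the lower bound. It is not the FRTDP algorithm itself: the abstract does not spell out the outcome-selection rule or the termination criteria, so the sketch assumes a simple stand-in (follow the successor with the largest probability-weighted bound gap, and stop a trial when the gap at the current state is at most ε). The toy MDP, the discount factor, and all function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy MDP: state -> action -> list of (probability, next_state, reward).
Outcome = Tuple[float, str, float]
MDP = Dict[str, Dict[str, List[Outcome]]]

GAMMA = 0.95  # assumed discount factor

TOY: MDP = {
    "start": {
        "slow": [(1.0, "goal", 1.0)],
        "fast": [(0.8, "goal", 2.0), (0.2, "start", 0.0)],
    },
    "goal": {},  # terminal state: no actions, value 0
}

def q_value(mdp: MDP, bound: Dict[str, float], s: str, a: str) -> float:
    """One-step lookahead value of action a in state s under the given bound."""
    return sum(p * (r + GAMMA * bound[s2]) for p, s2, r in mdp[s][a])

def backup(mdp: MDP, lower: Dict[str, float], upper: Dict[str, float],
           s: str) -> Optional[str]:
    """Tighten both bounds at s; return the greedy action under the upper bound."""
    if not mdp[s]:
        return None
    lower[s] = max(lower[s], max(q_value(mdp, lower, s, a) for a in mdp[s]))
    best = max(mdp[s], key=lambda a: q_value(mdp, upper, s, a))
    upper[s] = min(upper[s], q_value(mdp, upper, s, best))
    return best

def trial(mdp: MDP, lower: Dict[str, float], upper: Dict[str, float],
          s0: str, eps: float, max_depth: int = 100) -> None:
    """One search trial from s0, followed by backups in reverse visiting order."""
    s, visited = s0, []
    for _ in range(max_depth):
        a = backup(mdp, lower, upper, s)
        if a is None or upper[s] - lower[s] <= eps:
            break
        visited.append(s)
        # Assumed stand-in rule: pick the outcome with the largest
        # probability-weighted gap between the bounds.
        _, s, _ = max(mdp[s][a], key=lambda o: o[0] * (upper[o[1]] - lower[o[1]]))
    for v in reversed(visited):
        backup(mdp, lower, upper, v)

def solve(mdp: MDP, s0: str, eps: float = 1e-3, r_max: float = 2.0,
          max_trials: int = 1000) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Run trials until the bounds at s0 are within eps of each other."""
    # Valid initial bounds, assuming rewards lie in [0, r_max].
    lower = {s: 0.0 for s in mdp}
    upper = {s: (r_max / (1 - GAMMA) if mdp[s] else 0.0) for s in mdp}
    for _ in range(max_trials):
        if upper[s0] - lower[s0] <= eps:
            break
        trial(mdp, lower, upper, s0, eps)
    return lower, upper

def lower_bound_policy(mdp: MDP, lower: Dict[str, float], s: str) -> str:
    """Output policy: the action that is greedy with respect to the lower bound."""
    return max(mdp[s], key=lambda a: q_value(mdp, lower, s, a))

lower, upper = solve(TOY, "start")
print(lower["start"], upper["start"], lower_bound_policy(TOY, lower, "start"))
```

In this toy problem the optimal value of `start` satisfies V = 1.6 + 0.19 V, so V = 1.6 / 0.81 ≈ 1.975, and the two bounds close in on it from either side. Basing the policy on the lower bound means the reported value is one the returned policy is guaranteed to achieve, whereas the upper bound only serves to direct exploration.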