A critical issue for the application of Markov decision processes (MDPs) to realistic problems is how the complexity of planning scales with the size of the MDP. In stochastic environments with very large or infinite state spaces, traditional planning and reinforcement learning algorithms may be inapplicable, since their running time typically grows linearly with the state space size in the worst case. In this paper we present a new algorithm that, given only a generative model (a natural and common type of simulator) for an arbitrary MDP, performs on-line, near-optimal planning with a per-state running time that has no dependence on the number of states. The running time is exponential in the horizon time (which depends only on the discount factor γ and the desired degree of approximation to the optimal policy). Our algorithm thus provides a different complexity trade-off than classical algorithms such as value iteration—rather than scaling linearly in both horizon time and state space size, our running time trades an exponential dependence on the former in exchange for no dependence on the latter.

Our algorithm is based on the idea of sparse sampling. We prove that a randomly sampled look-ahead tree that covers only a vanishing fraction of the full look-ahead tree nevertheless suffices to compute near-optimal actions from any state of an MDP. Practical implementations of the algorithm are discussed, and we draw ties to our related recent results on finding a near-best strategy from a given class of strategies in very large partially observable MDPs (Kearns, Mansour, & Ng, Neural Information Processing Systems 13, to appear).
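The sparse sampling idea described above can be sketched in a few lines: from the current state, recursively estimate each action's value by drawing a fixed number of successor states from the generative model, so the cost depends on the sampling width and horizon but never on the total number of states. The sketch below is illustrative, not the paper's exact procedure; the function names (`sparse_sampling`, `toy_generate`) and the toy two-action MDP are our own assumptions.

```python
import random  # a real generative model would typically draw stochastic transitions


def sparse_sampling(generate, actions, state, depth, width, gamma):
    """Return (greedy_action, value_estimate) for `state` using a sparse
    look-ahead tree: `width` sampled successors per action, to the given
    `depth`. Per-call cost is O((len(actions) * width) ** depth), with no
    dependence on the size of the state space."""
    if depth == 0:
        return None, 0.0  # horizon reached: estimate remaining value as 0
    best_action, best_q = None, float("-inf")
    for a in actions:
        total = 0.0
        for _ in range(width):
            # one call to the generative model: sampled next state and reward
            next_state, reward = generate(state, a)
            _, future = sparse_sampling(generate, actions, next_state,
                                        depth - 1, width, gamma)
            total += reward + gamma * future
        q = total / width  # Monte Carlo estimate of Q(state, a) to this horizon
        if q > best_q:
            best_action, best_q = a, q
    return best_action, best_q


def toy_generate(state, action):
    # Hypothetical single-state MDP: action 1 always yields reward 1, action 0 yields 0.
    return state, (1.0 if action == 1 else 0.0)


action, value = sparse_sampling(toy_generate, [0, 1], 0,
                                depth=3, width=2, gamma=0.9)
# With deterministic rewards the estimate is exact: 1 + 0.9 + 0.81 = 2.71.
```

In practice the width needed for a near-optimal guarantee grows with the horizon and 1/(1−γ), which is where the exponential horizon dependence in the running time comes from.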