Dynamic Non-uniform Abstractions for Approximate Planning in Large Structured Stochastic Domains
PRICAI '98 Proceedings of the 5th Pacific Rim International Conference on Artificial Intelligence: Topics in Artificial Intelligence
In a deterministic world, a planning agent can be certain of the consequences of its planned sequence of actions. Not so, however, in dynamic, stochastic domains, where Markov decision processes (MDPs) are commonly used. Unfortunately, these suffer from the 'curse of dimensionality': if the state space is the Cartesian product of many small sets ('dimensions'), planning is exponential in the number of those dimensions. Our new technique exploits the intuitive strategy of selectively ignoring different dimensions in different parts of the state space. The resulting non-uniformity has strong implications: the approximation is no longer Markovian, and so requires a modified planner. We also use a spatial and temporal proximity measure, which responds both to continued planning and to the agent's movement through the state space, to adapt the abstraction dynamically as planning progresses. We present qualitative and quantitative results across a range of experimental domains, showing that an agent exploiting this novel approximation method successfully finds solutions to the planning problem while considering far less than the full state space. Finally, we assess and analyse the features of domains that our method can exploit.
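The two ideas in the abstract — the exponential blow-up of a factored state space, and a non-uniform abstraction that keeps full detail only near the agent — can be illustrated with a minimal sketch. This is not the paper's algorithm; the proximity test, the number of dimensions kept (`k`), and all names here are illustrative assumptions.

```python
# A state is a tuple over several small "dimensions"; the full space is
# their Cartesian product and grows exponentially with dimension count.
dims = 10          # number of dimensions (assumption for illustration)
values_per_dim = 4
full_size = values_per_dim ** dims  # 4^10 = 1,048,576 concrete states

# Non-uniform abstraction (sketch): near the agent, keep every dimension;
# far away, project out all but the first k dimensions, so many concrete
# states collapse into one abstract state.
def abstract_state(state, agent, k=2):
    # Crude proximity test on dimension 0 only (an assumption, not the
    # paper's spatial/temporal proximity measure).
    near = abs(state[0] - agent[0]) <= 1
    return state if near else state[:k]

agent = (0,) * dims
s_near = (1, 3, 2, 0, 1, 2, 3, 0, 1, 2)
s_far = (3, 3, 2, 0, 1, 2, 3, 0, 1, 2)
print(abstract_state(s_near, agent))  # full detail retained: all 10 dims
print(abstract_state(s_far, agent))   # collapsed to 2 dims: (3, 3)
```

Because the abstraction depends on where the agent is, the abstract model changes as the agent moves — the non-uniformity the abstract refers to, which is why the approximate problem is no longer Markovian in the abstract states alone.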