We argue that many AI planning problems should be viewed as process-oriented, where the aim is to produce a policy or behavior strategy with no termination condition in mind, as opposed to goal-oriented. The full power of Markov decision models, recently adopted for AI planning, becomes apparent with process-oriented problems. The question of appropriate optimality criteria becomes more critical in this case; we argue that average-reward optimality is the most suitable. While the construction of average-optimal policies involves a number of subtleties and computational difficulties, certain aspects of the problem can be solved using compact action representations such as Bayes nets. In particular, we provide an algorithm that identifies the structure of the Markov process underlying a planning problem, a crucial element of constructing average-optimal policies, without explicit enumeration of the problem state space.
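The structural question the abstract alludes to is the chain structure of the Markov process induced by a policy: average-reward evaluation depends on its recurrent (closed communicating) classes. The paper's contribution is to answer this from a compact Bayes-net representation without enumerating states; as a point of reference only, the sketch below shows the explicit-state version of that question, computing recurrent classes as strongly connected components of the transition-support graph that have no outgoing edges. The graph, function names, and the four-state example are illustrative assumptions, not the paper's algorithm.

```python
def sccs(adj):
    """Kosaraju's algorithm: strongly connected components of a digraph.
    adj maps each state to the list of states reachable in one step."""
    order, seen = [], set()

    def dfs(u):  # first pass: record finishing order
        seen.add(u)
        for v in adj.get(u, []):
            if v not in seen:
                dfs(v)
        order.append(u)

    for u in adj:
        if u not in seen:
            dfs(u)

    radj = {}  # reversed graph for the second pass
    for u in adj:
        for v in adj[u]:
            radj.setdefault(v, []).append(u)

    comps, assigned = [], set()
    for u in reversed(order):
        if u in assigned:
            continue
        stack, comp = [u], set()
        assigned.add(u)
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in radj.get(x, []):
                if y not in assigned:
                    assigned.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def recurrent_classes(adj):
    """A communicating class is recurrent iff it is closed:
    no transition leaves the class."""
    return [c for c in sccs(adj)
            if all(v in c for u in c for v in adj.get(u, []))]


# Hypothetical 4-state chain: {0, 1} is transient (1 can escape to 2),
# while {2, 3} is closed and hence the single recurrent class.
chain = {0: [1], 1: [0, 2], 2: [3], 3: [2]}
print(recurrent_classes(chain))  # [{2, 3}]
```

For a factored (Bayes-net) representation, the same classification must be inferred from the structure of the network rather than from an explicit graph, since the state space is exponential in the number of state variables; that is precisely the enumeration the paper's algorithm avoids.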