We consider the problem of policy optimization for a resource-limited agent with multiple time-dependent objectives, represented as an MDP with multiple discount factors in the objective function and constraints. We show that limiting the search to stationary deterministic policies, coupled with a novel reduction of the problem to mixed integer programming, yields a computationally feasible algorithm for finding such policies, where no such algorithm had previously been identified. In the simpler case where the constrained MDP has a single discount factor, our technique provides a new way of finding an optimal deterministic policy, where previous methods could only find randomized policies. We analyze the properties of our approach and describe implementation results.
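To make the problem setting concrete, the following is a minimal sketch of a constrained MDP in which the reward and the cost are discounted at different rates. It is not the mixed integer programming reduction described above: it simply enumerates every stationary deterministic policy by brute force, which is only workable for toy instances. The two-state MDP, the function names, the discount factors, and the budget values are all hypothetical and chosen for illustration.

```python
from itertools import product

def discounted_value(P, f, policy, gamma, start=0, iters=2000):
    """Expected discounted sum of f under a stationary deterministic policy.

    P[s][a][t] is the probability of moving from state s to t under action a,
    f[s][a] is the per-step reward (or cost), policy[s] is the action in s.
    Uses fixed-point iteration, which converges because gamma < 1.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            f[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * v[t] for t in range(n))
            for s in range(n)
        ]
    return v[start]

def best_deterministic_policy(P, reward, cost, gamma_r, gamma_c, budget, start=0):
    """Brute-force search (illustrative only, exponential in the state count).

    Returns (policy, value) maximizing the gamma_r-discounted reward among
    policies whose gamma_c-discounted cost does not exceed the budget,
    or None if no deterministic policy is feasible.
    """
    n_states, n_actions = len(P), len(P[0])
    best = None
    for policy in product(range(n_actions), repeat=n_states):
        if discounted_value(P, cost, policy, gamma_c, start) > budget + 1e-9:
            continue  # violates the resource constraint
        value = discounted_value(P, reward, policy, gamma_r, start)
        if best is None or value > best[1]:
            best = (policy, value)
    return best

# Hypothetical toy MDP: action 0 = idle, action 1 = work.
# Working in state 0 moves the agent to state 1; state 1 is absorbing.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: idle stays, work moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1: both actions stay in state 1
]
reward = [[0.0, 1.0], [0.0, 2.0]]  # working pays off, more so in state 1
cost = [[0.0, 1.0], [0.0, 1.0]]    # working consumes one unit of resource

tight = best_deterministic_policy(P, reward, cost, gamma_r=0.9, gamma_c=0.5, budget=1.5)
loose = best_deterministic_policy(P, reward, cost, gamma_r=0.9, gamma_c=0.5, budget=2.5)
```

In this toy instance the policy that always works has a discounted cost of 2, so the tighter budget forces the agent to work once and then idle, while the looser budget admits the always-work policy. Enumeration visits every one of the |A|^|S| deterministic policies, which is why a more scalable formulation is needed for anything beyond a handful of states.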