In multiple-criteria Markov Decision Processes (MDPs), where several costs are incurred at every decision point, current methods minimise the expected primary cost criterion while constraining the expectations of the other cost criteria to given critical values. However, systems often face hard constraints, under which the cost criteria must never exceed their critical values at any time, rather than constraints on expected costs. For example, a resource-limited sensor network stops functioning once its energy is depleted. Based on the semi-MDP (sMDP) model, we study the hard-constrained (HC) problem in continuous time, with continuous state and action spaces, for both finite and infinite horizons and for various cost criteria. We show that the HCsMDP problem is NP-hard and that every HCsMDP has an equivalent discrete-time MDP. Hence, classical methods such as reinforcement learning can solve HCsMDPs.
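The equivalence result suggests a concrete solution route: fold the resources consumed so far (or remaining) into the state, so the hard constraint becomes an action-feasibility restriction in an ordinary discrete-time MDP, which tabular Q-learning can then solve. The sketch below is a minimal illustration of that idea on a hypothetical toy chain; the environment, cost figures, budget, and all identifiers (step, feasible, BUDGET, and so on) are illustrative assumptions, not the paper's construction.

```python
import random
from collections import defaultdict

# All of the following is a hypothetical toy setup, not the paper's model:
# a 5-state chain in which each step incurs a primary cost (to be minimised)
# and a deterministic resource cost that must NEVER exceed BUDGET in total.
BUDGET = 10                       # hard cap on accumulated resource cost
RESOURCE_COST = {0: 1, 1: 2}      # resource cost of each action (assumed)
GAMMA, ALPHA, EPSILON = 0.95, 0.1, 0.1

def step(state, action):
    """Toy dynamics: returns (next_state, primary_cost, resource_cost, done)."""
    next_state = min(state + (1 if action == 1 else 0), 4)
    primary_cost = 1.0 if next_state < 4 else 0.0
    return next_state, primary_cost, RESOURCE_COST[action], next_state == 4

def feasible(remaining):
    # Mask any action whose resource cost would breach the hard constraint;
    # this is what makes the constraint hard rather than an expectation.
    return [a for a in (0, 1) if RESOURCE_COST[a] <= remaining]

# Q-table over the AUGMENTED state (state, remaining budget): tracking the
# remaining budget in the state is the standard way to recover an ordinary
# discrete-time MDP from the hard-constrained problem.
Q = defaultdict(float)

def choose(state, remaining):
    acts = feasible(remaining)
    if random.random() < EPSILON:
        return random.choice(acts)
    return min(acts, key=lambda a: Q[(state, remaining, a)])  # minimise cost

for _ in range(5000):
    state, remaining, done = 0, BUDGET, False
    while not done and feasible(remaining):
        action = choose(state, remaining)
        nxt, cost, used, done = step(state, action)
        nxt_remaining = remaining - used
        acts = feasible(nxt_remaining)
        # Standard Q-learning backup on the augmented chain (costs, so min).
        target = cost if (done or not acts) else cost + GAMMA * min(
            Q[(nxt, nxt_remaining, a)] for a in acts)
        key = (state, remaining, action)
        Q[key] += ALPHA * (target - Q[key])
        state, remaining = nxt, nxt_remaining
```

Because actions are masked before they can overdraw the budget, the learned policy satisfies the constraint at every step, at the price of an enlarged (augmented) state space; this mirrors the abstract's claim that classical reinforcement learning applies once the equivalent discrete-time MDP is constructed.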