The goal of this paper is two-fold. First, we present a sensitivity point of view on the optimization of Markov systems. We show that Markov decision processes (MDPs) and the policy-gradient approach, or perturbation analysis (PA), can be derived easily from two fundamental sensitivity formulas, and that such formulas can be constructed flexibly, from first principles, with performance potentials as building blocks. Second, with this sensitivity view we propose an event-based optimization approach, comprising event-based sensitivity analysis and event-based policy iteration. This approach exploits the special features of a system characterized by events; it illustrates how the potentials can be aggregated using these features and how the aggregated potentials can be used in policy iteration. Compared with the traditional MDP approach, the event-based approach has several advantages: the number of aggregated potentials may scale with the system size even though the number of states grows exponentially in the system size, which reduces the policy space and saves computation; the approach does not require the actions at different states to be independent; and it exploits the special features of a system and does not require knowledge of the exact transition probability matrix. The main ideas of the approach are illustrated by an admission control problem.
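To make the role of performance potentials concrete, here is a minimal sketch (not code from the paper; the two-state MDP below is an invented example) of average-reward policy iteration in which the potentials g are obtained from the Poisson equation (I - P)g = f - eta*1 and then used in the policy-improvement step:

```python
import numpy as np

def potentials(P, f):
    """Performance potentials g and average reward eta of an ergodic
    chain with transition matrix P and reward vector f, obtained from
    the Poisson equation (I - P) g = f - eta * 1."""
    n = len(f)
    # Stationary distribution: pi P = pi, sum(pi) = 1 (consistent
    # overdetermined system, solved by least squares).
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    eta = float(pi @ f)
    # Fundamental-matrix form: g = (I - P + 1 pi^T)^{-1} (f - eta 1),
    # which fixes the additive constant so that pi @ g = 0.
    g = np.linalg.solve(np.eye(n) - P + np.outer(np.ones(n), pi), f - eta)
    return g, eta

def policy_iteration(P_a, f_a):
    """Average-reward policy iteration driven by potentials.
    P_a[s][a] is the next-state distribution and f_a[s][a] the
    one-step reward under action a in state s."""
    n = len(P_a)
    policy = [0] * n
    while True:
        P = np.array([P_a[s][policy[s]] for s in range(n)])
        f = np.array([f_a[s][policy[s]] for s in range(n)], dtype=float)
        g, eta = potentials(P, f)
        # Improvement: greedy in f(s, a) + sum_j p(j | s, a) g(j).
        new = [max(range(len(P_a[s])),
                   key=lambda a: f_a[s][a] + np.dot(P_a[s][a], g))
               for s in range(n)]
        if new == policy:
            return policy, eta
        policy = new
```

The event-based approach in the paper replaces the per-state potentials above with potentials aggregated over events, so the improvement step ranges over far fewer quantities when the event space is small.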