From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
Discrete Event Dynamic Systems
This paper provides an introductory discussion of an important concept, the performance potentials of Markov processes, and its relations with perturbation analysis (PA), average-cost Markov decision processes (MDPs), Poisson equations, α-potentials, the fundamental matrix, and the group inverse of the transition matrix (or the infinitesimal generators). Applications to single sample path-based performance sensitivity estimation and performance optimization are also discussed. On-line algorithms for performance sensitivity estimates and on-line schemes for policy iteration methods are presented. The approach is closely related to reinforcement learning algorithms.
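
To make the central objects concrete, the following is a minimal sketch, not the paper's on-line algorithms, of how performance potentials might be computed for a small ergodic chain and used for a sensitivity estimate. It assumes the common form of the Poisson equation, (I - P) g + η e = f, with the normalization π g = 0, and the derivative formula dη/dδ = π (P' - P) g along the direction from P to P'. The matrices `P` and `P_alt`, the reward vector `f`, and all function names are hypothetical and chosen only for illustration.

```python
# Sketch under stated assumptions: a 3-state ergodic, aperiodic chain.
# P, P_alt, f and every function name here are illustrative, not from the paper.

def stationary(P, iters=2000):
    """Approximate the stationary distribution pi (pi P = pi) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def average_reward(P, f):
    """Long-run average reward eta = pi f."""
    return sum(p * r for p, r in zip(stationary(P), f))

def potentials(P, f, iters=2000):
    """Solve (I - P) g + eta e = f by the fixed-point iteration
    g <- f - eta e + P g, started at g = 0 (this yields pi g = 0)."""
    n = len(P)
    pi = stationary(P)
    eta = sum(p * r for p, r in zip(pi, f))
    g = [0.0] * n
    for _ in range(iters):
        g = [f[i] - eta + sum(P[i][j] * g[j] for j in range(n)) for i in range(n)]
    return pi, eta, g

P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]
P_alt = [[0.3, 0.3, 0.4],   # only the first row differs from P
         [0.2, 0.6, 0.2],
         [0.1, 0.4, 0.5]]
f = [1.0, 0.0, 2.0]

pi, eta, g = potentials(P, f)
n = len(P)

# Sensitivity of eta along the direction P -> P_alt, from potentials alone.
derivative = sum(
    pi[i] * sum((P_alt[i][j] - P[i][j]) * g[j] for j in range(n))
    for i in range(n)
)

# Finite-difference check of the same derivative.
delta = 1e-6
P_delta = [[P[i][j] + delta * (P_alt[i][j] - P[i][j]) for j in range(n)]
           for i in range(n)]
finite_diff = (average_reward(P_delta, f) - eta) / delta

print("eta =", eta)
print("g =", g)
print("derivative via potentials =", derivative)
print("finite-difference estimate =", finite_diff)
```

The same vector `g` serves both purposes named in the abstract: it gives the derivative of the average reward without simulating the perturbed chain, and comparing the values of f(i) + Σ_j p(i, j) g(j) across actions is the improvement step of average-cost policy iteration. The fixed-point iteration here stands in for inverting I - P + eπ (the fundamental matrix); that substitution is a convenience of this sketch.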