Likelihood ratio gradient estimation for stochastic systems
Communications of the ACM - Special issue on simulation
IEEE/ACM Transactions on Networking (TON)
A tutorial on simulation optimization
WSC '92 Proceedings of the 24th conference on Winter simulation
Discrete optimization in simulation: a method and applications
WSC '92 Proceedings of the 24th conference on Winter simulation
ACM Transactions on Modeling and Computer Simulation (TOMACS)
A review of simulation optimization techniques
Proceedings of the 30th conference on Winter simulation
An overview of derivative estimation
WSC '91 Proceedings of the 23rd conference on Winter simulation
Optimization of stochastic systems
WSC '86 Proceedings of the 18th conference on Winter simulation
Optimization in simulation: a survey of recent results
WSC '87 Proceedings of the 19th conference on Winter simulation
Likelihood ratio gradient estimation: an overview
WSC '87 Proceedings of the 19th conference on Winter simulation
Future directions in response surface methodology for simulation
WSC '87 Proceedings of the 19th conference on Winter simulation
Simulation optimization methodologies
Proceedings of the 31st conference on Winter simulation: Simulation---a bridge to the future - Volume 1
Proceedings of the 33rd conference on Winter simulation
On the sensitivity analysis of the expected accumulated reward
Performance Evaluation
Approximate Gradient Methods in Policy-Space Optimization of Markov Reward Processes
Discrete Event Dynamic Systems
Proceedings of the 34th conference on Winter simulation: exploring new frontiers
Proceedings of the 34th conference on Winter simulation: exploring new frontiers
Sensitivity analysis for transient single server queuing models using an interpolation approach
WSC '04 Proceedings of the 36th conference on Winter simulation
Infinite-horizon policy-gradient estimation
Journal of Artificial Intelligence Research
Experiments with infinite-horizon, policy-gradient estimation
Journal of Artificial Intelligence Research
Journal of Computational and Applied Mathematics
The optimal reward baseline for gradient-based reinforcement learning
UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
Variance reduction for sensitivity estimates obtained from regenerative simulation
Operations Research Letters
In this paper, we introduce two convergent Monte Carlo algorithms for optimizing complex stochastic systems. The first algorithm, which is applicable to regenerative processes, operates by estimating finite differences. The second algorithm is of Robbins-Monro type and is applicable to Markov chains; it is driven by derivative estimates obtained via a likelihood ratio argument.
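
To make the two ideas concrete, here is a minimal sketch on a toy problem. It is not the algorithms of the paper: instead of a regenerative process or a Markov chain it uses a single exponential random variable X with mean theta, and it minimizes the hypothetical objective E[X] + C/theta, whose minimizer is sqrt(C). The cost coefficient `C`, the step size 1/n, the projection interval, the batch size, and all function names are assumptions made for illustration. One gradient estimator uses finite differences (here with common random numbers), the other uses the likelihood ratio identity d/dtheta E[h(X)] = E[h(X) d/dtheta log p(X; theta)]; both feed the same Robbins-Monro style recursion.

```python
import math
import random

C = 4.0  # hypothetical cost coefficient; the true minimizer is sqrt(C) = 2.0


def lr_gradient(theta, rng, batch=32):
    """Likelihood ratio estimate of d/dtheta (E[X] + C / theta)."""
    total = 0.0
    for _ in range(batch):
        # Draw X ~ Exponential with mean theta by inversion.
        x = -theta * math.log(1.0 - rng.random())
        # Score: derivative in theta of log((1/theta) * exp(-x/theta)).
        score = (x - theta) / theta ** 2
        total += x * score
    # The C / theta term is deterministic, so it is differentiated exactly.
    return total / batch - C / theta ** 2


def fd_gradient(theta, rng, width, batch=32):
    """Central finite-difference estimate using common random numbers."""
    total = 0.0
    for _ in range(batch):
        # The same uniform draw drives both perturbed samples.
        log_u = math.log(1.0 - rng.random())
        x_plus = -(theta + width) * log_u
        x_minus = -(theta - width) * log_u
        total += (x_plus - x_minus) / (2.0 * width)
    cost_diff = C / (theta + width) - C / (theta - width)
    return total / batch + cost_diff / (2.0 * width)


def stochastic_approximation(grad, theta0=1.0, steps=5000, lo=0.2, hi=10.0):
    """Projected recursion theta <- theta - a_n * gradient, with a_n = 1/n."""
    theta = theta0
    for n in range(1, steps + 1):
        theta -= grad(theta, n) / n
        theta = min(hi, max(lo, theta))  # keep theta inside [lo, hi]
    return theta


rng = random.Random(0)
theta_lr = stochastic_approximation(lambda th, n: lr_gradient(th, rng))
theta_fd = stochastic_approximation(
    lambda th, n: fd_gradient(th, rng, 0.1 * n ** -0.25)
)
print(theta_lr, theta_fd)  # both should land near sqrt(C) = 2.0
```

The finite-difference variant shrinks its perturbation width as the iteration count grows, while the likelihood ratio variant needs only one sample per evaluation at the current theta. In this toy setting the density is known in closed form, which is what makes the score easy to write down.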