This paper analyzes the behavior of adaptive control schemes for automatic learning. Estimates of the sensitivities are used in a gradient-based stochastic approximation procedure to drive the process along the steepest descent trajectory in search of the optimum. The learning rates are kept constant for adaptability. For such procedures, convergence can be established in a weak sense. A model problem of a flexible machine is presented, for which the control parameter is a probability vector. We propose a new sensitivity estimator, generalizing the phantom rare perturbation analysis (RPA) estimator to multi-valued decisions. From the basic properties of the estimators, we build several updating rules based on the weak convergence theory to ensure asymptotic optimality. We illustrate the predicted theoretical behavior with computer simulations. Finally, we compare the behavior of our proposed scheme with that of a regenerative one for which we can establish strong convergence. Our results show that weak convergence yields a dramatic improvement in the rate of convergence, in addition to the capability of adaptation, or tracking.
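To make the procedure described above concrete, the following is a minimal sketch of a constant-gain stochastic approximation over a probability vector. It is not the paper's method: the phantom RPA sensitivity estimator is replaced by a synthetic noisy gradient of an assumed quadratic cost, and the names `project_to_simplex`, `constant_gain_sa`, `noisy_quadratic_gradient`, and the `TARGET` vector are hypothetical, introduced only for illustration. The projection step is the standard sort-based Euclidean projection onto the simplex, used here as one possible way to keep the iterate a valid probability vector.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {x : x_i >= 0, sum(x) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        # The condition holds for a prefix of the sorted values;
        # the last candidate that satisfies it is the threshold.
        if u_j - candidate > 0:
            tau = candidate
    return [max(x - tau, 0.0) for x in v]


def constant_gain_sa(theta0, gradient_estimate, step, n_iter, rng):
    """Projected stochastic approximation with a constant step size."""
    theta = project_to_simplex(theta0)
    for _ in range(n_iter):
        g = gradient_estimate(theta, rng)
        theta = project_to_simplex(
            [t - step * g_i for t, g_i in zip(theta, g)]
        )
    return theta


# Assumed toy cost J(theta) = 0.5 * ||theta - TARGET||^2 (hypothetical).
TARGET = [0.5, 0.3, 0.2]


def noisy_quadratic_gradient(theta, rng, noise_sd=0.1):
    """Stand-in for a sensitivity estimator: true gradient plus noise."""
    return [t - c + rng.gauss(0.0, noise_sd) for t, c in zip(theta, TARGET)]


if __name__ == "__main__":
    rng = random.Random(0)
    start = [1.0 / 3, 1.0 / 3, 1.0 / 3]
    final = constant_gain_sa(start, noisy_quadratic_gradient, 0.05, 2000, rng)
    print(final)
```

Because the step size is constant, the iterate does not settle at a single point but keeps fluctuating in a neighborhood of the optimum. This is the sense in which only weak convergence can be expected, and it is also what lets such a scheme track an optimum that drifts over time.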