Strong points of weak convergence: a study using RPA gradient estimation for automatic learning

Authors:
Felisa J. Vázquez-Abad
Affiliations:
Department of Computer Science and Operations Research, University of Montreal, Canada and Department of Electrical and Electronic Engineering, The University of Melbourne, Australia
Venue:
Automatica (Journal of IFAC)
Year:
1999



Abstract

This paper analyzes the behavior of adaptive control schemes for automatic learning. Sensitivity estimates are used in a gradient-based stochastic approximation procedure to drive the process along the steepest descent trajectory in search of the optimum. The learning rates are kept constant for adaptability. For such procedures, convergence can be established in a weak sense. A model problem of a flexible machine is presented, for which the control parameter is a probability vector. We propose a new sensitivity estimator that generalizes the phantom rare perturbation analysis (RPA) estimator to multi-valued decisions. From the basic properties of the estimators, we build several updating rules based on weak convergence theory to ensure asymptotic optimality. We illustrate the predicted theoretical behavior with computer simulations. Finally, we compare the behavior of our proposed scheme with that of a regenerative one, for which we can establish strong convergence. Our results show that weak convergence yields a dramatic improvement in the rate of convergence, in addition to the capability of adaptation, or tracking.
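
As a rough illustration of the kind of procedure the abstract describes, the sketch below runs a constant-gain stochastic approximation over a probability vector. It is not the paper's algorithm. The quadratic toy cost, the Gaussian-noise gradient estimate (standing in for an RPA sensitivity estimator), the Euclidean projection onto the simplex, and all names (`project_to_simplex`, `constant_gain_sa`, `noisy_grad`, `target`) are assumptions made for illustration only.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    n = v.size
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0                  # cumulative sums shifted by the target mass
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1.0)              # common shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def constant_gain_sa(grad_estimate, theta0, step=0.01, n_iter=5000, seed=0):
    """Projected stochastic approximation with a constant step size (learning rate)."""
    rng = np.random.default_rng(seed)
    theta = project_to_simplex(np.asarray(theta0, dtype=float))
    for _ in range(n_iter):
        g = grad_estimate(theta, rng)         # noisy sensitivity estimate
        theta = project_to_simplex(theta - step * g)
    return theta


# Hypothetical toy cost J(theta) = ||theta - target||^2, observed with noise.
target = np.array([0.5, 0.3, 0.2])


def noisy_grad(theta, rng):
    """Gradient of the toy cost plus zero-mean Gaussian noise (assumed noise model)."""
    return 2.0 * (theta - target) + rng.normal(scale=0.5, size=theta.size)


theta_hat = constant_gain_sa(noisy_grad, [1 / 3, 1 / 3, 1 / 3])
print(theta_hat)
```

Because the step size stays fixed, the iterate does not settle at a single point. It keeps fluctuating in a neighborhood of the minimizer whose size shrinks with `step`, which is the trade-off that lets a constant-gain scheme keep tracking a changing optimum.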