Rate of Convergence for Constrained Stochastic Approximation Algorithms

  • Authors:
  • Robert Buche; Harold J. Kushner

  • Venue:
  • SIAM Journal on Control and Optimization
  • Year:
  • 2001

Abstract

There is a large literature on the rate of convergence problem for general unconstrained stochastic approximations. Typically, one centers the iterate $\theta_n$ about the limit point $\bar\theta$ and then normalizes by dividing by the square root of the step size $\epsilon_n$. Then some type of convergence in distribution or weak convergence of $U_n$, the centered and normalized iterate, is proved. For example, one proves that the interpolated process formed by the $U_n$ converges weakly to a stationary Gaussian diffusion, and the variance of the stationary measure is taken to be a measure of the rate of convergence. See the references in [A. Benveniste, M. Métivier, and P. Priouret, Adaptive Algorithms and Stochastic Approximation, Springer-Verlag, Berlin, New York, 1990; L. Gerencsér, SIAM J. Control Optim., 30 (1992), pp. 1200--1227; H. J. Kushner and D. S. Clark, Stochastic Approximation for Constrained and Unconstrained Systems, Springer-Verlag, Berlin, New York, 1978; H. J. Kushner and G. Yin, Stochastic Approximation Algorithms and Applications, Springer-Verlag, Berlin, New York, 1997; M. T. Wasan, Stochastic Approximation, Cambridge University Press, Cambridge, UK, 1969] for algorithms where the step size either goes to zero or is small and constant. Large deviations provide an alternative approach to the rate of convergence problem [P. Dupuis and H. J. Kushner, SIAM J. Control Optim., 23 (1985), pp. 675--696; P. Dupuis and H. J. Kushner, SIAM J. Control Optim., 27 (1989), pp. 1108--1135; P. Dupuis and H. J. Kushner, Probab. Theory Related Fields, 75 (1987), pp. 223--244; A. P. Korostelev, Stochastic Recurrent Processes, Nauka, Moscow, 1984; H. J. Kushner and G. Yin, Stochastic Approximation Algorithms and Applications, Springer-Verlag, Berlin, New York, 1997].

When the iterates of the algorithm are constrained to lie in some bounded set, the limit point is frequently on the boundary. With the exception of the large deviations type [P. Dupuis and H. J. Kushner, SIAM J. Control Optim., 23 (1985), pp. 675--696; P. Dupuis and H. J. Kushner, Probab. Theory Related Fields, 75 (1987), pp. 223--244], the rate of convergence literature is essentially confined to the case where the limit point is not on a constraint boundary. When the limit point is on the boundary of the constraint set, the usual steps are hard to carry out. In particular, the stability methods that are used to prove tightness of the normalized iterates cannot be carried over in general, and there remain the problems of proving tightness of the normalized process and characterizing the limit process. This paper develops the necessary techniques and shows that the stationary Gaussian diffusion is replaced by an appropriate stationary reflected linear diffusion, whose variance plays the same role as a measure of the rate of convergence.

An application to constrained function minimization under inequality constraints $q^i(x)\le 0$, $i\le p$, is given, where both the objective function and the constraints are observed in the presence of noise. The iteration is on both the basic state variable and a Lagrange multiplier, which is constrained to be nonnegative. If a limit multiplier value for an active constraint is zero, then the classical method for computing the rate cannot be used, but (under appropriate conditions) it is a special case of our results. Rate of convergence results are important because, among other reasons, they immediately yield the advantages of iterate averaging methods, as noted in [H. J. Kushner and G. Yin, Stochastic Approximation Algorithms and Applications, Springer-Verlag, Berlin, New York, 1997].
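For reference, the centering and normalization described in the abstract can be written in a standard form (assuming a projected algorithm with constraint set $H$ and noisy observations $Y_n$; the exact algorithm and interpolation used in the paper may differ in detail):

$$\theta_{n+1} = \Pi_H\bigl[\theta_n + \epsilon_n Y_n\bigr], \qquad U_n = \frac{\theta_n - \bar\theta}{\sqrt{\epsilon_n}},$$

where $\Pi_H$ denotes projection onto $H$. It is the interpolated process formed by the $U_n$ whose weak limit is the stationary Gaussian diffusion when $\bar\theta$ is interior to $H$, and the stationary reflected linear diffusion when $\bar\theta$ lies on the boundary.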
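The primal-dual iteration described in the application paragraph can be illustrated by the following Python sketch of a generic projected stochastic approximation with a nonnegative Lagrange multiplier. This is not the authors' exact scheme; the toy objective, constraint, box, noise level, and step sizes are illustrative assumptions.

```python
# Sketch of a constrained stochastic approximation of the type described in the
# abstract: minimize f(x) subject to q(x) <= 0, with both f and q observed in
# noise, the state projected onto a bounded box, and the Lagrange multiplier
# kept nonnegative.  All problem data below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def grad_f(x):                 # gradient of the toy objective (x1-1)^2 + (x2-1)^2
    return 2.0 * (x - 1.0)

def q(x):                      # toy constraint q(x) = x1 + x2 - 1 <= 0
    return x[0] + x[1] - 1.0

def grad_q(x):                 # gradient of the constraint
    return np.array([1.0, 1.0])

x = np.zeros(2)                # state iterate theta_n
lam = 0.0                      # Lagrange multiplier, constrained to be >= 0
sigma = 0.1                    # observation noise level

for n in range(1, 20001):
    eps = 1.0 / n              # decreasing step size epsilon_n
    # Noisy observations of the Lagrangian gradient and the constraint value.
    g_obs = grad_f(x) + lam * grad_q(x) + sigma * rng.standard_normal(2)
    q_obs = q(x) + sigma * rng.standard_normal()
    # State step, projected onto the box [-2, 2]^2.
    x = np.clip(x - eps * g_obs, -2.0, 2.0)
    # Multiplier step, projected onto the nonnegative half-line.
    lam = max(0.0, lam + eps * q_obs)

print(x, lam)                  # roughly (0.5, 0.5), with multiplier near 1
```

In this toy problem the limit multiplier for the active constraint is positive; the paper's results also cover the degenerate case in which the limit multiplier of an active constraint is zero, where the classical rate computation cannot be used.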