Foundations of neural networks
Learning internal representations by error propagation. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1.
Convergence analysis of perturbed feasible descent methods. Journal of Optimization Theory and Applications.
Improved generalization via tolerant training. Journal of Optimization Theory and Applications.
Error stability properties of generalized gradient-type algorithms. Journal of Optimization Theory and Applications.
Neuro-Dynamic Programming
Neural Networks for Optimization and Signal Processing
Incremental Least Squares Methods and the Extended Kalman Filter. SIAM Journal on Optimization.
An Incremental Gradient(-Projection) Method with Momentum Term and Adaptive Stepsize Rule. SIAM Journal on Optimization.
A New Class of Incremental Gradient Methods for Least Squares Problems. SIAM Journal on Optimization.
Incremental Subgradients for Constrained Convex Optimization: A Unified Framework and New Methods. SIAM Journal on Optimization.
We consider the class of incremental gradient methods for minimizing a sum of continuously differentiable functions. An important novel feature of our analysis is that the stepsizes are kept bounded away from zero. We derive the first convergence results of any kind for this computationally important case. In particular, we show that a certain ϵ-approximate solution can be obtained and establish the linear dependence of ϵ on the stepsize limit. Incremental gradient methods are particularly well suited for large neural network training problems, where obtaining an approximate solution is typically sufficient and often preferable to computing an exact solution. Thus, in the context of neural networks, the approach presented here is related to the principle of tolerant training. Our results justify numerous stepsize rules that were derived on the basis of extensive numerical experimentation but for which no theoretical analysis was previously available. In addition, convergence to (exact) stationary points is established when the gradient satisfies a certain growth property.
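
For illustration only, here is a minimal sketch (not taken from the paper) of the iteration the abstract analyzes: to minimize f(x) = f_1(x) + ... + f_m(x), each pass cycles through the components and applies x ← x − α ∇f_i(x) with a constant stepsize α bounded away from zero. The least-squares instance, the function names, and the stepsize value below are assumptions chosen for the example.

```python
import numpy as np

def incremental_gradient(component_grads, x0, alpha=0.01, epochs=200):
    """Cycle through the component gradients, stepping against each one.

    alpha is held constant (bounded away from zero), matching the
    stepsize regime described in the abstract above.
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        for grad_i in component_grads:
            x = x - alpha * grad_i(x)
    return x

# Hypothetical least-squares instance: f_i(x) = 0.5 * (a_i @ x - b_i)**2,
# whose gradient is (a_i @ x - b_i) * a_i.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
b = rng.standard_normal(50)
component_grads = [lambda x, a=a_i, bi=b_i: (a @ x - bi) * a
                   for a_i, b_i in zip(A, b)]

x_approx = incremental_gradient(component_grads, np.zeros(3), alpha=0.01)
# With alpha fixed, the iterates settle into a neighborhood of a
# stationary point; per the abstract, the achievable accuracy epsilon
# shrinks linearly with the stepsize limit.
print(x_approx)
```

Because the stepsize never vanishes, the iterates do not converge exactly but oscillate near a solution; shrinking α tightens that neighborhood at the cost of slower progress, which is the trade-off the ϵ-vs-stepsize result above quantifies.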