Conjugate gradient methods offer practical advantages in numerical experiments, such as fast convergence and low memory requirements. This paper considers a class of conjugate gradient learning methods for three-layer backpropagation neural networks. We propose a new algorithm for almost-cyclic learning of neural networks based on the Polak-Ribière-Polyak (PRP) conjugate gradient method. We then establish deterministic convergence results for three learning modes: batch, cyclic, and almost-cyclic learning. Two kinds of convergence are proved: weak convergence, meaning that the gradient of the error function tends to zero, and strong convergence, meaning that the weight sequence tends to a fixed point. The convergence results differ across learning modes and depend on the strategy used to select the learning rate. Illustrative numerical examples support the theoretical analysis.
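To make the PRP update concrete, the following is a minimal sketch of batch-mode PRP conjugate gradient, not the algorithm analyzed in the paper. It uses the standard PRP coefficient beta_k = g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2 and search direction d_{k+1} = -g_{k+1} + beta_k d_k. Several pieces are assumptions made for illustration only: the toy least-squares error stands in for a three-layer network's error function, the learning rate is chosen by Armijo backtracking (one possible selection strategy; the paper's own strategies are not stated in the abstract), a restart to steepest descent is added as a safeguard, and all names (`SAMPLES`, `prp_batch`, `eta0`, `c`) are hypothetical.

```python
import math

# Toy data (hypothetical): three samples, two weights; exact fit at w = (1, 2).
SAMPLES = [((1.0, 0.0), 1.0), ((0.0, 1.0), 2.0), ((1.0, 1.0), 3.0)]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def error(w):
    """Batch error E(w) = 1/2 * sum_j (w . x_j - y_j)^2 (stand-in for a network error)."""
    return 0.5 * sum((dot(w, x) - y) ** 2 for x, y in SAMPLES)


def gradient(w):
    """Gradient of E accumulated over the whole batch."""
    g = [0.0] * len(w)
    for x, y in SAMPLES:
        residual = dot(w, x) - y
        for i, xi in enumerate(x):
            g[i] += residual * xi
    return g


def prp_batch(w, eta0=0.25, c=1e-4, tol=1e-10, max_iters=200):
    """Batch-mode PRP conjugate gradient with Armijo backtracking (assumed strategy)."""
    g = gradient(w)
    d = [-gi for gi in g]          # first direction: steepest descent
    history = [error(w)]
    for _ in range(max_iters):
        if math.sqrt(dot(g, g)) < tol:
            break                  # gradient is (numerically) zero
        slope = dot(g, d)
        if slope >= 0.0:           # safeguard (assumption): restart if d is not a descent direction
            d = [-gi for gi in g]
            slope = dot(g, d)
        # Halve the learning rate until the Armijo sufficient-decrease test holds.
        eta, e_old = eta0, history[-1]
        w_new = [wi + eta * di for wi, di in zip(w, d)]
        while error(w_new) > e_old + c * eta * slope and eta > 1e-12:
            eta *= 0.5
            w_new = [wi + eta * di for wi, di in zip(w, d)]
        if error(w_new) > e_old:
            break                  # no acceptable learning rate found
        g_new = gradient(w_new)
        # PRP coefficient: g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2
        beta = dot(g_new, [gn - go for gn, go in zip(g_new, g)]) / dot(g, g)
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        w, g = w_new, g_new
        history.append(error(w))
    return w, history


weights, errors = prp_batch([0.0, 0.0])
print("final weights:", weights)
print("final error:", errors[-1])
```

The backtracking test only accepts steps that do not increase the error, so the recorded error sequence is non-increasing by construction. Cyclic and almost-cyclic modes would instead update the weights one sample at a time, in a fixed or reshuffled order respectively; that variant is not shown here.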