2008 Special Issue: Second-order stagewise backpropagation for Hessian-matrix analyses and investigation of negative curvature

Authors:
Eiji Mizutani;Stuart E. Dreyfus
Affiliations:
Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, 106, Taiwan;Department of Industrial Engineering and Operations Research, University of California at Berkeley, CA 94720, USA
Venue:
Neural Networks
Year:
2008

Abstract

Multi-stage feed-forward neural network (NN) learning with sigmoidal-shaped hidden-node functions is an implicitly constrained optimization problem featuring negative curvature. Our analyses of the Hessian matrix H of the sum-squared-error measure highlight three findings: (1) at an early stage of learning, H tends to be indefinite and much better conditioned than the Gauss-Newton Hessian J^T J; (2) the NN structure influences the indefiniteness and rank of H; and (3) exploiting negative curvature leads to effective learning. All of these can be confirmed numerically with our stagewise second-order backpropagation, a systematic procedure that exploits the NN's "layered symmetry" to compute H efficiently, making exact Hessian evaluation feasible for fairly large practical problems.
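To make the comparison between H and J^T J concrete, here is a minimal sketch, not the authors' stagewise second-order backpropagation. For a sum-squared error E = 1/2 Σ r_i², the gradient is J^T r and the full Hessian is H = J^T J + Σ r_i ∇²r_i. The second term can make H indefinite, whereas J^T J is always positive semidefinite. Everything specific below is an assumption chosen for illustration: the 2-2-1 network with tanh hidden nodes and a linear output, the XOR data, the random starting point, and the use of central finite differences on the analytic gradient to approximate H.

```python
import numpy as np

# Illustrative data (assumption): the four XOR patterns.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([0., 1., 1., 0.])

N_PARAMS = 9  # 2-2-1 net: W1 (2x2), b1 (2), w2 (2), b2 (1)

def unpack(theta):
    """Split the flat parameter vector into layer weights and biases."""
    W1 = theta[0:4].reshape(2, 2)  # row j holds the weights into hidden node j
    b1 = theta[4:6]
    w2 = theta[6:8]
    b2 = theta[8]
    return W1, b1, w2, b2

def residuals_and_jacobian(theta):
    """Return residuals r (4,) and the Jacobian J = dr/dtheta (4, 9)."""
    W1, b1, w2, b2 = unpack(theta)
    A = np.tanh(X @ W1.T + b1)   # hidden activations, shape (4, 2)
    r = A @ w2 + b2 - T          # linear output minus target
    D = (1.0 - A**2) * w2        # dr/d(hidden pre-activation), shape (4, 2)
    J = np.empty((4, N_PARAMS))
    # dr_n/dW1[j, k] = D[n, j] * X[n, k], flattened in the same order as unpack()
    J[:, 0:4] = (D[:, :, None] * X[:, None, :]).reshape(4, 4)
    J[:, 4:6] = D                # dr/db1
    J[:, 6:8] = A                # dr/dw2
    J[:, 8] = 1.0                # dr/db2
    return r, J

def gradient(theta):
    """Gradient of E = 0.5 * sum(r**2), i.e. J^T r."""
    r, J = residuals_and_jacobian(theta)
    return J.T @ r

def full_hessian(theta, h=1e-5):
    """Approximate the full Hessian by central differences of the gradient."""
    H = np.empty((N_PARAMS, N_PARAMS))
    for i in range(N_PARAMS):
        e = np.zeros(N_PARAMS)
        e[i] = h
        H[:, i] = (gradient(theta + e) - gradient(theta - e)) / (2.0 * h)
    return 0.5 * (H + H.T)       # symmetrize away finite-difference noise

rng = np.random.default_rng(0)
theta = rng.normal(scale=0.5, size=N_PARAMS)  # arbitrary early-stage point

r, J = residuals_and_jacobian(theta)
H = full_hessian(theta)
GN = J.T @ J                                   # Gauss-Newton Hessian

eig_H = np.linalg.eigvalsh(H)
eig_GN = np.linalg.eigvalsh(GN)
print("eigenvalues of H:    ", np.round(eig_H, 4))
print("eigenvalues of J^T J:", np.round(eig_GN, 4))
print("rank of J:", np.linalg.matrix_rank(J), "of", N_PARAMS, "parameters")
```

With only four training patterns, J has at most four independent rows, so J^T J is necessarily rank-deficient for this nine-parameter network, while H carries the extra residual-weighted curvature term. Negative entries in the printed spectrum of H, if any appear at the chosen point, are the negative curvature the abstract refers to.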