Training multilayer perceptrons with the extended Kalman algorithm. Advances in Neural Information Processing Systems 1.
Fast exact multiplication by the Hessian. Neural Computation.
Additive versus exponentiated gradient updates for linear prediction. STOC '95: Proceedings of the Twenty-Seventh Annual ACM Symposium on Theory of Computing.
Natural gradient works efficiently in learning. Neural Computation.
A fast, compact approximation of the exponential function. Neural Computation.
Neural Networks for Pattern Recognition
Fast Second-Order Gradient Descent via O(n) Curvature Matrix-Vector Products
The Gauss-Newton approximation of the Hessian guarantees positive semi-definiteness while retaining more second-order information than the Fisher information matrix. We extend it from nonlinear least squares to arbitrary differentiable objectives, such that positive semi-definiteness is maintained for the standard loss functions used in neural network regression and classification. We give efficient algorithms for computing the product of the extended Gauss-Newton and Fisher information matrices with arbitrary vectors, using techniques similar to, but even cheaper than, the fast Hessian-vector product [1]. The stability of SMD [2,3,4,5], a learning rate adaptation method that uses curvature matrix-vector products, improves when the extended Gauss-Newton matrix is substituted for the Hessian.
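The matrix-vector products described above can be illustrated with a minimal NumPy sketch. For a Gauss-Newton matrix G = J^T H_L J (J the Jacobian of the network outputs with respect to the weights, H_L the Hessian of the loss with respect to the outputs), Gv is computed by one forward tangent propagation (yielding Jv) followed by one ordinary backward pass with Jv in place of the error signal, never forming J or G explicitly. The one-hidden-layer tanh network and squared-error loss below (where H_L = I) are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def gauss_newton_vector_product(W1, W2, x, V1, V2):
    """Compute G v = J^T H_L J v for a one-hidden-layer tanh MLP with
    squared-error loss (so H_L = I), without forming J or G.
    (V1, V2) is the vector v, laid out like the weights (W1, W2)."""
    # ordinary forward pass
    a = x @ W1
    h = np.tanh(a)
    # tangent (R-operator) forward pass: propagate v to get J v
    Ra = x @ V1
    Rh = (1.0 - h**2) * Ra
    Jv = Rh @ W2 + h @ V2          # directional derivative of the outputs
    # for squared error, the Hessian of the loss w.r.t. the outputs is the
    # identity, so the "error" fed into the backward pass is just J v
    e = Jv
    # ordinary backward pass applied to e yields J^T e = G v
    GV2 = h.T @ e
    dh = e @ W2.T
    da = dh * (1.0 - h**2)
    GV1 = x.T @ da
    return GV1, GV2
```

Since the backward pass is applied to a tangent rather than a residual, the cost is two passes through the network per product, and the resulting quadratic form v^T G v = ||Jv||^2 is nonnegative by construction, matching the positive semi-definiteness guarantee discussed in the abstract.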