We propose a generic method for iteratively approximating various second-order gradient steps (Newton, Gauss-Newton, Levenberg-Marquardt, and natural gradient) in linear time per iteration, using special curvature matrix-vector products that can be computed in O(n), where n is the number of parameters. Two recent acceleration techniques for online learning, matrix momentum and stochastic meta-descent (SMD), implement this approach. Because the two were originally derived by very different routes, this perspective offers fresh insight into their operation and leads to further improvements to SMD.
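As a concrete illustration of such an O(n) curvature matrix-vector product, the sketch below computes a Hessian-vector product by differentiating the gradient in a given direction (the forward-over-reverse trick behind Pearlmutter's fast exact multiplication by the Hessian). It is written in JAX; the toy loss function, shapes, and variable names are illustrative assumptions, not part of the original paper.

```python
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Toy regression loss; stands in for any scalar objective.
    pred = jnp.tanh(x @ w)
    return jnp.mean((pred - y) ** 2)

def hvp(loss_fn, w, v, *args):
    # Hessian-vector product H v, obtained by differentiating the
    # gradient in direction v (forward-over-reverse AD). Each product
    # costs about as much as one gradient evaluation, i.e. O(n) in the
    # parameter count n, and the n-by-n Hessian is never formed.
    grad_fn = lambda w_: jax.grad(loss_fn)(w_, *args)
    _, hv = jax.jvp(grad_fn, (w,), (v,))
    return hv

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 5))
y = jnp.zeros(32)
w = jnp.ones(5)
v = jnp.arange(5.0)
print(hvp(loss, w, v, x, y))  # H v, a length-5 vector
```

With products like this available, an iterative solver can approximate a second-order step without ever materializing the curvature matrix; the Gauss-Newton and natural-gradient variants the abstract mentions substitute the corresponding curvature matrix into the same kind of product.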