Improving gradient-based learning algorithms for large scale feedforward networks

  • Authors:
  • M. Ventresca; H. R. Tizhoosh

  • Affiliations:
  • Pattern Analysis and Machine Intelligence Laboratory and Systems Design Engineering Department, University of Waterloo, Waterloo, Ontario, Canada (both authors)

  • Venue:
  • IJCNN'09: Proceedings of the 2009 International Joint Conference on Neural Networks
  • Year:
  • 2009

Abstract

Large scale neural networks have hundreds or thousands of parameters (weights and biases) to learn, and as a result tend to have very long training times. Small scale networks can be trained quickly using second-order information, but this approach fails for large architectures due to its high computational cost. Other approaches employ local search strategies, which likewise add to the computational cost. In this paper we present a simple method, based on opposite transfer functions, which greatly improves the convergence rate and accuracy of gradient-based learning algorithms. We use two variants of the backpropagation algorithm and common benchmark data to highlight the improvements. We find statistically significant improvements in both convergence speed and accuracy.
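The method rests on the notion of an opposite transfer function: for a squashing function φ, the opposite is defined as φ̆(x) = φ(−x), which for the logistic sigmoid reduces to φ̆(x) = 1 − φ(x). The Python sketch below illustrates this relationship; it is not the authors' code, and the layer dimensions, weights, and side-by-side evaluation are hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic transfer function phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def opposite(phi):
    """Opposite transfer function: phi_breve(x) = phi(-x).
    For the logistic sigmoid this equals 1 - phi(x)."""
    return lambda x: phi(-x)

# Hypothetical single hidden layer (4 inputs, 3 units) evaluated with the
# original transfer function and with its opposite; a learner can keep
# whichever configuration currently gives the lower training error.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

h = sigmoid(W @ x)                 # usual activations
h_opp = opposite(sigmoid)(W @ x)   # opposite activations

print(np.allclose(h + h_opp, 1.0))  # True: sigmoid(z) + sigmoid(-z) = 1
```

Because swapping a neuron's transfer function for its opposite is equivalent to negating that neuron's incoming weights, a gradient-based learner can cheaply probe a symmetric region of weight space and continue from whichever variant has the lower error, which is the mechanism behind the convergence gains the abstract reports.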